Joint Filter and Waveform Design for Radar
STAP  in  Signal  Dependent  Interference 

Pawan Setlur, Member, IEEE, Muralidhar Rangaswamy, Fellow, IEEE


Abstract 

Waveform  design  is  a  pivotal  component  of  the  fully  adaptive  radar  construct.  In  this  paper  we 
consider  waveform  design  for  radar  space  time  adaptive  processing  (STAP),  accounting  for  the  waveform 
dependence  of  the  clutter  correlation  matrix.  Due  to  this  dependence,  in  general,  the  joint  problem  of 
receiver  filter  optimization  and  radar  waveform  design  becomes  an  intractable,  non-convex  optimization 
problem. Nevertheless, it is shown to be individually convex in either the filter or the waveform
variables. We derive constrained versions of: a) the alternating minimization algorithm, b) proximal
alternating minimization, and c) the constant modulus alternating minimization, each of which, at every step,
iteratively optimizes either the STAP filter or the waveform independently. A fast and slow time model
permits waveform design in radar STAP, but the primary bottleneck is the computational complexity of
the algorithms.


Index  Terms 

Waveform design, waveform scheduling, space time adaptive radar, Capon beamformer, constant
modulus,  convex  optimization,  alternating  minimization,  regularization,  proximal  algorithms. 

I.  Introduction 

The objective of this report is to address waveform design in radar space time adaptive processing
(STAP) [1]-[4]. An airborne radar is assumed, with an array of sensor elements observing a moving
target on the ground. We will assume that the waveform design and scheduling are performed over one
coherent processing interval (CPI) rather than on an individual pulse repetition interval (PRI). To facilitate
waveform design, we develop a STAP model considering the fast time samples along with the slow time
processing. This is different from traditional STAP, which generally considers the data after matched
filtering [1], [2]. Nonetheless, STAP research efforts have been proposed which consider inclusion of fast
time samples in space time processing, see for example [1], [5], [6] and references therein.

In line with traditional STAP, we formulate the waveform design as a minimum variance distortionless
response (MVDR) type optimization [7]. As we will see in the sequel, inclusion of the waveform
increases  the  dimensionality  of  the  correlation  matrix.  Classical  Radar  STAP  is  computationally  expensive 
but  the  waveform  adaptive  STAP  increases  the  complexity  by  several  orders  of  magnitude.  Therefore, 
benefits  of  waveform  design  in  STAP  come  at  the  expense  of  increased  computational  complexity.  The 
noise,  clutter,  and  interference  are  modeled  stochastically  and  are  assumed  to  be  mutually  uncorrelated. 
Endemic  to  airborne  STAP,  clutter  is  persistent  in  most  range  gates  resulting  from  ground  reflections. 
The clutter correlation matrix is a function of the waveform, causing the joint receiver filter and waveform
optimization  to  be  non-convex  with  no  closed  form  solution.  However,  it  is  analytically  shown  here  that 
the  STAP  MVDR  objective  is  convex  with  respect  to  (w.r.t.)  the  receiver  filter  for  a  fixed  but  arbitrary 
waveform,  and  vice  versa.  Therefore,  alternating  minimization  approaches  arise  as  natural  candidate 
solutions. Alternating minimization itself has a rich history in the optimization literature, possibly
motivated directly from the works in [8]-[11], with some not so recent seminal contributions [12]-[15]
and recent contributions (not exhaustive) [16], [17]. Other celebrated algorithms, such as the Arimoto-Blahut
algorithm to calculate channel capacity and the expectation-maximization (EM) algorithm, are all
examples of alternating minimization.

Here  we  address  the  joint  optimization  problem  via  a  constrained  alternating  minimization  approach, 
which  has  the  favorable  property  of  monotonicity  in  successive  objective  evaluations.  Convergence, 
performance  guarantees  and  other  properties  pertinent  to  this  algorithm  are  further  addressed.  Full  rank 
correlation  matrices  are  required  in  implementing  the  constrained  alternating  minimization  approach.  In 
practice,  radar  STAP  contends  with  rank  deficient  correlation  matrices  due  to  lack  of  homogeneous  training 
data. In this case, the constrained alternating minimization approach is not implementable. To address
this  issue,  we  consider  regularization  of  the  STAP  objective  via  strongly  convex  functions  resulting  in  the 
constrained  proximal  alternating  minimization  [18].  Proximal  algorithms,  originally  proposed  by  [19],  [20] 
are  well  suited  candidate  techniques  for  constrained,  large  scale  optimization  [16],  [21]-[24],  applicable 
readily  to  our  waveform  adaptive  STAP  problem.  In  fact,  as  we  will  see  subsequently  the  constrained 
proximal  alternating  minimization  results  in  diagonal  loading  solutions,  and  for  optimization-specific 
interpretations,  the  load  factors  may  be  related  to  the  Lipschitz  constants  (w.r.t.  the  gradient). 



Approved  for  public  release;  distribution  unlimited. 


P.  SETLUR  AND  M.  RANGASWAMY:  AFRL  SENSORS  DIRECTORATE  TECH.  REPORT.  2014. 




Signal  dependent  interference:  Chicken  or  the  Egg?  The  fundamental  problem  in  practical  radar 
waveform  design  is  analogous  to  the  chicken  or  the  egg  problem.  Signal  dependent  interference,  i.e., 
clutter,  can  only  be  perfectly  characterized  by  transmitting  a  signal.  Herein  lies  the  central  problem. 
The  estimated  clutter  properties  could  therefore  be  dependent  on  what  was  transmitted  in  the  first  place. 
This  is  especially  true  for  frequency  selective  and  dispersive  clutter  responses  frequently  encountered 
in  radar  operations,  for  example,  urban  terrain.  Therefore,  any  claim  of  optimality  is  myopic.  Sadly  the 
same  problem  would  also  persist  when  the  target  impulse  responses  are  used  to  shape  the  waveform. 
Unfortunately, and as famously stated by Woodward [25], [26], “... what to transmit remains substantially
unanswered” [27], [28].

We will assume, like other works in signal dependent interference waveform design [29]-[37], that
the clutter response is known a priori. To a certain extent, this may be obtained via a combination of
previous radar transmissions [38], or by assuming that the topography is known from ground elevation
maps, synthetic aperture radar imagery [39], or access to knowledge aided databases as in DARPA's
KASSPER program [40].

Literature: The signal dependent interference waveform design problem has had a rich history [41],
[42]. Iterative approaches, including but not limited to alternating minimization type techniques, have been
the subject of work in [29]-[37], [43] for SISO and MIMO radars, but never in radar STAP. Waveform design
for STAP without considering the signal dependent interference clutter was addressed in [44], where the authors'
premise is that the degrees of freedom from the waveform could be used in suppressing the interference
and noise, while the degrees of freedom from the filter could be used exclusively for suppressing the
clutter. A joint STAP waveform and STAP filter design was never considered. Further, their premise is
erroneous for several reasons. For any radar application, but especially in STAP, obtaining
range cells which are interference free or clutter free is impossible. Nonetheless, assuming this was
possible, the weight vector for exclusive clutter suppression uses the inverse of the clutter interference
correlation matrix only, and not, as stated in [44], the inverse of the (clutter+noise+interference)
correlation matrix. Furthermore, such a detector may have disastrous consequences, because control of
the false alarm rates becomes impossible due to the self induced coloring on other range cells which are
contaminated by the clutter plus interference plus noise.

Other contributions in waveform design and waveform scheduling for extended targets in radar using
information theoretic measures, tracking, etc. can be seen in [45]-[50], [51]-[56], and the references
therein.

We outline some of the contributions for the signal dependent interference problem which have thus
far appeared in the literature.

Approaches different from alternating minimization: In [29], [32], [34], [43] a single sensor radar was
assumed. In [29], the authors used the symmetry property of the cross-ambiguity function to design an
iterative algorithm for the signal dependent interference problem. Their algorithm cannot be modified
easily for the multi-sensor framework or for the case when noise is, in general, colored. The problem was
addressed from a detection perspective in [32], and led to a waterfilling [47] type solution. A similar waterfilling
type metric, albeit in the discrete time domain, was obtained in [43], where the authors also imposed
constant modulus and peak to average power ratio (PAPR) waveform constraints. An iterative algorithm
was derived in [34], where a monotonic increase in SINR was not guaranteed, and it was shown that the waveform
could always be chosen as minimum phase.

Alternating minimization type approaches: In [33], a MIMO sensor framework was employed, convergence
was not addressed, convexity was not proven, and no practical waveform constraints were
imposed on the design. See also, in this report, Section III, the paragraph following Rem. 5, where some
of the conclusions drawn in [33] are further discussed. Alternating minimization was used in [35]-[37]
but, for reasons unknown, was called sequential optimization. In [35], [36], a SISO model advocating
joint filter and radar code design (after matched filtering) was employed. Convexity of
the objective in the individual filter or radar code was never shown. Convergence in iterates was not
proven formally, nor was it shown via simulations. The constant modulus constraint was not invoked
directly but through a similarity constraint. In [37], the authors used a MIMO radar framework, and
relaxation techniques were employed in their iterative algorithm. Neither convergence nor convexity was
demonstrated analytically. Constant modulus and similarity constraints were enforced separately
in the waveform design.

Notation: The variable $N$ is used interchangeably for the number of fast time samples as well
as the conventional dimension of arbitrary real or complex (sub)spaces. Its meaning is readily interpreted
from context. The symbol $\|\cdot\|$ always denotes the $\ell_2$ norm. Vectors are always lowercase bold, matrices
are bold uppercase, $\lambda$ is typically reserved for eigenvalues (with $\lambda_o$ being an exception, as it is used for the
spatial frequency, defined later) and $\gamma$ is strictly reserved for the Lagrange multipliers ($\gamma_{pq}$ is an exception
used for the radar cross section of the $p$-th scatterer in the $q$-th clutter patch). Solutions to the optimization
are denoted as $(\cdot)_o$, i.e. with the subscript $o$. The complex conjugate is denoted with $(\cdot)^*$. The sets of reals,
complex numbers, and natural numbers are denoted as $\mathbb{R}$, $\mathbb{C}$, $\mathbb{N}$, respectively. Other symbols are defined
upon first use and are standard in the literature.

Organization: The STAP fast time-slow time model is delineated in Section II, and in Section III the
filter and waveform optimization is derived. Some preliminary simulations are presented in Section IV
and the resulting conclusions are drawn in Section V.

II.  STAP  Model 

The radar consists of a calibrated airborne linear array comprising $M$ sensor elements, each having
an identical antenna pattern. Without loss of generality, assume that the first sensor in the array is the
phase center and acts as both a transmitter and receiver; the rest of the elements are purely receivers.
The first sensor is located at $\mathbf{x}_r \in \mathbb{R}^3$ and the ground based point target at $\mathbf{x}_t \in \mathbb{R}^3$. The radar transmits
the burst of pulses:

$$u(t) = \sum_{l=1}^{L} s(t - lT_p)\exp\big(j2\pi f_o(t - lT_p)\big), \quad t \in [0, T) \tag{1}$$

where $f_o$ is the carrier frequency, and $T_p = 1/f_p$ is the inverse of the pulse repetition frequency, $f_p$. The
pulse width and bandwidth are denoted as $T$ and $B$, respectively. The coherent processing interval (CPI)
consists of $L$ pulses, each of width equal to $T$. The geometry of the scene is shown in Fig. 1, where $\theta_t$
and $\phi_t$ denote the azimuth and elevation. The radar and target are both assumed to be moving.

For  the  time  being,  we  ignore  the  noise,  clutter  and  interference  and  assume  a  non-fluctuating  target. 
Then the desired target's received signal for the $l$-th pulse, and at the $m$-th sensor element, is given by

$$s_{ml}(t) = \rho_t\, s(t - lT_p - \tau_m)\, e^{j2\pi(f_o + f_{d_m})(t - lT_p - \tau_m)} \tag{2}$$

where the target's observed Doppler shift is denoted as $f_{d_m}$, and its complex back-scattering coefficient
as $\rho_t$. Assume that the array is along the local $x$ axis as shown in Fig. 1. Then, the coordinates of the
$m$-th element are given by $\mathbf{x}_r + m\mathbf{d}$, $\mathbf{d} := [d, 0, 0]^T$, $m = 0, 1, 2, \ldots, M-1$, where $d$ is the inter-element
spacing. The delay $\tau_m$ could be re-written as


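\begin{align}
\tau_m &= \|\mathbf{x}_r - \mathbf{x}_t\|/c + \|\mathbf{x}_r + m\mathbf{d} - \mathbf{x}_t\|/c \nonumber \\
&= \frac{\|\mathbf{x}_r - \mathbf{x}_t\|}{c} + \frac{\|\mathbf{x}_r - \mathbf{x}_t\|}{c}\sqrt{1 + \frac{\|m\mathbf{d}\|^2}{\|\mathbf{x}_r - \mathbf{x}_t\|^2} + \frac{2m\mathbf{d}^T(\mathbf{x}_r - \mathbf{x}_t)}{\|\mathbf{x}_r - \mathbf{x}_t\|^2}} \tag{3} \\
&\overset{(a)}{\approx} \frac{\|\mathbf{x}_r - \mathbf{x}_t\|}{c} + \frac{\|\mathbf{x}_r - \mathbf{x}_t\|}{c}\left(1 + \frac{m\mathbf{d}^T(\mathbf{x}_r - \mathbf{x}_t)}{\|\mathbf{x}_r - \mathbf{x}_t\|^2}\right) \nonumber \\
&= \frac{2\|\mathbf{x}_r - \mathbf{x}_t\|}{c} + \frac{m\mathbf{d}^T(\mathbf{x}_r - \mathbf{x}_t)}{c\|\mathbf{x}_r - \mathbf{x}_t\|} \tag{4}
\end{align}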


where in approximation (a), the term $\propto \|m\mathbf{d}\|^2$ was ignored, i.e. it is assumed that $d/\|\mathbf{x}_r - \mathbf{x}_t\| \ll 1$, and then
a binomial approximation was employed. From geometric manipulations, we also have:


$$\frac{\mathbf{x}_r - \mathbf{x}_t}{\|\mathbf{x}_r - \mathbf{x}_t\|} = [\sin(\phi_t)\sin(\theta_t),\ \sin(\phi_t)\cos(\theta_t),\ \cos(\phi_t)]^T.$$

Using the above equation in (4), the delay $\tau_m$, $m = 0, 1, \ldots, M-1$, can be rewritten as

$$\tau_m = \frac{2\|\mathbf{x}_r - \mathbf{x}_t\|}{c} + \frac{m d \sin(\phi_t)\sin(\theta_t)}{c}. \tag{5}$$
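The far-field delay approximation in (4) and (5) can be checked numerically. The following is a minimal sketch, not part of the original report: the geometry (range, element spacing, angles, array size) is hypothetical, and the unit vector from target to radar is taken to be $[\sin\phi_t\sin\theta_t, \sin\phi_t\cos\theta_t, \cos\phi_t]^T$ as in the text.

```python
import numpy as np

c = 299_792_458.0  # speed of light in m/s

def exact_delay(x_r, x_t, d_vec, m):
    """Two-way delay: transmit from the first sensor, receive at the m-th sensor."""
    return (np.linalg.norm(x_r - x_t) + np.linalg.norm(x_r + m * d_vec - x_t)) / c

def approx_delay(x_r, x_t, d, m, theta_t, phi_t):
    """Far-field approximation: 2R/c + m d sin(phi_t) sin(theta_t) / c."""
    R = np.linalg.norm(x_r - x_t)
    return 2.0 * R / c + m * d * np.sin(phi_t) * np.sin(theta_t) / c

# Hypothetical geometry (illustrative values only).
theta_t, phi_t = np.deg2rad(30.0), np.deg2rad(70.0)
R, d, M = 10e3, 0.015, 8  # range [m], element spacing [m], number of sensors
u = np.array([np.sin(phi_t) * np.sin(theta_t),
              np.sin(phi_t) * np.cos(theta_t),
              np.cos(phi_t)])
x_r = np.array([0.0, 0.0, R * np.cos(phi_t)])  # radar above the ground plane
x_t = x_r - R * u                              # so that (x_r - x_t)/||x_r - x_t|| = u
d_vec = np.array([d, 0.0, 0.0])                # array along the local x axis

errors = [abs(exact_delay(x_r, x_t, d_vec, m)
              - approx_delay(x_r, x_t, d, m, theta_t, phi_t))
          for m in range(M)]
max_error = max(errors)
print(f"max delay error over the array: {max_error:.3e} s")
```

For these values the neglected term is of order $(md)^2/(2Rc)$, which is many orders of magnitude below a typical fast time sampling interval.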


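The Doppler shift, i.e. $f_{d_m}$, is computed as

$$f_{d_m} = 2f_o\frac{(\dot{\mathbf{x}}_r - \dot{\mathbf{x}}_t)^T(\mathbf{x}_r - \mathbf{x}_t)}{c\|\mathbf{x}_r - \mathbf{x}_t\|} + f_o\frac{m\mathbf{d}^T}{c}\left(\frac{\dot{\mathbf{x}}_r - \dot{\mathbf{x}}_t}{\|\mathbf{x}_r - \mathbf{x}_t\|} - \frac{(\mathbf{x}_r - \mathbf{x}_t)(\mathbf{x}_r - \mathbf{x}_t)^T(\dot{\mathbf{x}}_r - \dot{\mathbf{x}}_t)}{\|\mathbf{x}_r - \mathbf{x}_t\|^3}\right) \tag{6}$$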


where $\dot{\mathbf{x}}_{(\cdot)}$ is the vector differential of $\mathbf{x}_{(\cdot)}$ w.r.t. time. In practice $d$ is a fraction of the wavelength, and assuming
that $d/\|\mathbf{x}_r - \mathbf{x}_t\| \ll 1$ we approximate the second term in (6) as 0. The Doppler shift is then no longer a function of
the sensor index, $m$, and is rewritten as


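$$f_{d_m} = f_d = \frac{2f_o(\dot{\mathbf{x}}_r - \dot{\mathbf{x}}_t)^T(\mathbf{x}_r - \mathbf{x}_t)}{c\|\mathbf{x}_r - \mathbf{x}_t\|} \tag{7}$$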


A.  Vector  signal  model 

Let $s(t)$ be sampled discretely, resulting in $N$ discrete time samples. Consider for now the single range gate
corresponding to the time delay $\tau_t$. After a suitable alignment to a common local time (or range) reference, and
invoking some standard assumptions, see also [57, A.1-A.3], the radar returns in the $l$-th PRI, written as a vector
$\mathbf{y}_l \in \mathbb{C}^{NM}$, are given by


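$$\mathbf{y}_l = \rho_t\, \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t)\exp(-j2\pi f_d(l-1)T_p) \tag{8}$$

$$\mathbf{a}(\theta_t, \phi_t) := [1, e^{-j2\pi\vartheta}, \ldots, e^{-j2\pi(M-1)\vartheta}]^T \in \mathbb{C}^M$$

where $\mathbf{s} := [s_1, s_2, \ldots, s_N]^T \in \mathbb{C}^N$ and $\vartheta := d\sin(\theta_t)\sin(\phi_t)/\lambda_o$ is defined as the spatial frequency. Further, it
is noted that in (8) the constant phase terms have been absorbed into $\rho_t$. Considering the $L$ pulses together, i.e.
concatenating the desired target's response for the entire CPI in a tall vector, $\mathbf{y}$ is defined as

$$\mathbf{y} \in \mathbb{C}^{NML} = [\mathbf{y}_1^T, \mathbf{y}_2^T, \ldots, \mathbf{y}_L^T]^T = \rho_t\, \mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t)$$

$$\mathbf{v}(f_d) := [1, e^{-j2\pi f_d T_p}, \ldots, e^{-j2\pi f_d(L-1)T_p}]^T. \tag{9}$$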

The  vector  y  consists  of  both  the  spatial  and  the  temporal  steering  vectors  as  in  classical  STAP,  as  well  as  the 
waveform  dependency,  via  waveform  vector  s.  Due  to  inclusion  of  the  fast  time  samples  in  the  waveform  s,  the 
STAP  data  cube  is  modified  to  reflect  this  change,  and  is  depicted  in  Fig.  2. 
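As an illustration of (8) and (9), the sketch below builds the target response with NumPy Kronecker products and confirms that each block of the stacked vector is the per-pulse return. The function names and all numerical values are hypothetical choices made for this example and do not come from the report.

```python
import numpy as np

def spatial_steering(M, vartheta):
    """a = [1, e^{-j2 pi vartheta}, ..., e^{-j2 pi (M-1) vartheta}]^T."""
    return np.exp(-2j * np.pi * vartheta * np.arange(M))

def temporal_steering(L, f_d, T_p):
    """v(f_d) = [1, e^{-j2 pi f_d T_p}, ..., e^{-j2 pi f_d (L-1) T_p}]^T."""
    return np.exp(-2j * np.pi * f_d * T_p * np.arange(L))

def target_snapshot(rho_t, s, M, L, vartheta, f_d, T_p):
    """Stacked target response rho_t * (v kron s kron a), of length N*M*L."""
    a = spatial_steering(M, vartheta)
    v = temporal_steering(L, f_d, T_p)
    return rho_t * np.kron(v, np.kron(s, a))

# Illustrative sizes and parameters (assumptions, not values from the report).
N, M, L = 4, 3, 5
vartheta, f_d, T_p = 0.2, 100.0, 1e-3
rng = np.random.default_rng(0)
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)

y = target_snapshot(1.0, s, M, L, vartheta, f_d, T_p)

# The l-th block of y (l = 1, ..., L) should equal the per-pulse return in (8).
l = 3
block = y[(l - 1) * N * M : l * N * M]
expected = np.kron(s, spatial_steering(M, vartheta)) * np.exp(-2j * np.pi * f_d * (l - 1) * T_p)
blocks_match = np.allclose(block, expected)
print(y.shape, blocks_match)
```

The ordering of the Kronecker factors matters: with $\mathbf{v} \otimes \mathbf{s} \otimes \mathbf{a}$, the sensor index varies fastest, then the fast time index, then the pulse index.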

At  the  considered  range  gate,  the  measured  snapshot  vector  consists  of  the  target  returns  and  the  undesired 
returns,  i.e.  clutter  returns,  interference  and  noise.  The  contaminated  snapshot  at  the  considered  range  gate  is  then 
given by

$$\tilde{\mathbf{y}} = \mathbf{y} + \mathbf{y}_i + \mathbf{y}_c + \mathbf{y}_n = \mathbf{y} + \mathbf{y}_u \tag{10}$$

where $\mathbf{y}_i, \mathbf{y}_c, \mathbf{y}_n$ are the contributions from the interference, clutter and noise, respectively, and are assumed to be
statistically uncorrelated with one another. The contributions of the undesired returns are treated in detail, starting
with the noise as it is the simplest.

Noise: The noise is assumed to be zero mean, identically distributed across the sensors, across pulses, and in
the fast time samples. The correlation matrix of $\mathbf{y}_n$ is denoted as $\mathbf{R}_n \in \mathbb{C}^{NML \times NML}$.

Interference: The interference consists of jammers and other intentional or unintentional sources which may be
ground based, airborne or both. Let us assume that there are $K$ interference sources. Further, since nothing is
known about the jammers' waveform characteristics, the waveform itself is assumed to be a stationary zero mean
random process. Consider the $k$-th interference source in the $l$-th PRI, at spatial coordinates $(\theta_k, \phi_k)$. Its
corresponding snapshot contribution is modeled as

$$\mathbf{y}_{kl} = \boldsymbol{\alpha}_{kl} \otimes \mathbf{a}(\theta_k, \phi_k), \quad k = 1, 2, \ldots, K, \quad l = 0, 1, \ldots, L-1$$

where $\boldsymbol{\alpha}_{kl} = [\alpha_{kl}(0), \alpha_{kl}(1), \ldots, \alpha_{kl}(N-1)]^T \in \mathbb{C}^N$ is the random discrete segment of the jammer waveform,
as seen by the radar in the $l$-th PRI. Stacking $\mathbf{y}_{kl}$ for a fixed $k$ as a tall vector, we have

$$\mathbf{y}_k = \boldsymbol{\alpha}_k \otimes \mathbf{a}(\theta_k, \phi_k) = [\mathbf{y}_{k0}^T, \mathbf{y}_{k1}^T, \ldots, \mathbf{y}_{k(L-1)}^T]^T \in \mathbb{C}^{NML}$$

$$\boldsymbol{\alpha}_k := [\boldsymbol{\alpha}_{k0}^T, \boldsymbol{\alpha}_{k1}^T, \ldots, \boldsymbol{\alpha}_{k(L-1)}^T]^T \in \mathbb{C}^{NL} \tag{11}$$

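Using the Kronecker mixed product property (see e.g. [58]), the correlation matrix of $\mathbf{y}_k$ is expressed as
$E\{\mathbf{y}_k\mathbf{y}_k^H\} = \mathbf{R}_\alpha^k \otimes \mathbf{a}(\theta_k, \phi_k)\mathbf{a}(\theta_k, \phi_k)^H$ where $E\{\boldsymbol{\alpha}_k\boldsymbol{\alpha}_k^H\} := \mathbf{R}_\alpha^k$. For $K$ mutually uncorrelated interferers, the
correlation matrix is

$$\mathbf{R}_i = \sum_{k=1}^{K} E\{\mathbf{y}_k\mathbf{y}_k^H\} = \sum_{k=1}^{K} \mathbf{R}_\alpha^k \otimes \mathbf{a}(\theta_k, \phi_k)\mathbf{a}(\theta_k, \phi_k)^H = \sum_{k=1}^{K} (\mathbf{I}_{NL} \otimes \mathbf{a}(\theta_k, \phi_k))\mathbf{R}_\alpha^k(\mathbf{I}_{NL} \otimes \mathbf{a}(\theta_k, \phi_k)^H),$$

and is simplified as

$$\mathbf{R}_i = \mathbf{A}(\theta, \phi)\mathbf{R}_\alpha\mathbf{A}(\theta, \phi)^H \tag{12}$$

where $\mathbf{R}_\alpha := \mathrm{Diag}\{\mathbf{R}_\alpha^1, \mathbf{R}_\alpha^2, \ldots, \mathbf{R}_\alpha^K\} \in \mathbb{C}^{NLK \times NLK}$ and $\mathbf{A}(\theta, \phi) \in \mathbb{C}^{NML \times NLK}$
$= [\mathbf{I}_{NL} \otimes \mathbf{a}(\theta_1, \phi_1), \mathbf{I}_{NL} \otimes \mathbf{a}(\theta_2, \phi_2), \ldots, \mathbf{I}_{NL} \otimes \mathbf{a}(\theta_K, \phi_K)]$. Here $\mathbf{I}_{NL}$ is the identity matrix of size $NL \times NL$, and
$\mathrm{Diag}\{\cdot, \cdot, \ldots, \cdot\}$ is the matrix diagonal operator which converts the matrix arguments into a bigger block diagonal matrix.

For example, $\mathrm{Diag}\{\mathbf{A}, \mathbf{B}, \mathbf{C}\} = \begin{bmatrix} \mathbf{A} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{B} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{C} \end{bmatrix}$.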
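The equivalence between the sum of Kronecker products and the compact form (12) can be verified numerically. This is a minimal sketch under stated assumptions: the per-jammer correlation matrices are random Hermitian PSD stand-ins, the spatial frequencies are arbitrary, and `block_diag` is a small hypothetical helper written for this example to mimic the Diag operator.

```python
import numpy as np

def steering(M, vartheta):
    """Spatial steering vector with unit-modulus entries."""
    return np.exp(-2j * np.pi * vartheta * np.arange(M))

def block_diag(blocks):
    """Place square blocks along the diagonal of a larger zero matrix (Diag operator)."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n), dtype=complex)
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

# Illustrative sizes; K jammers with hypothetical spatial frequencies.
N, M, L, K = 2, 3, 2, 2
varthetas = [0.1, -0.3]
rng = np.random.default_rng(1)

# Random Hermitian PSD stand-ins for the jammer waveform correlations R_alpha^k.
R_alpha_k = []
for _ in range(K):
    X = rng.standard_normal((N * L, N * L)) + 1j * rng.standard_normal((N * L, N * L))
    R_alpha_k.append(X @ X.conj().T)

# Sum form: sum_k R_alpha^k kron a_k a_k^H.
R_sum = sum(np.kron(Rk, np.outer(steering(M, th), steering(M, th).conj()))
            for Rk, th in zip(R_alpha_k, varthetas))

# Compact form (12): A R_alpha A^H with A = [I_NL kron a_1, ..., I_NL kron a_K].
A = np.hstack([np.kron(np.eye(N * L), steering(M, th)[:, None]) for th in varthetas])
R_alpha = block_diag(R_alpha_k)
R_i = A @ R_alpha @ A.conj().T

forms_agree = np.allclose(R_sum, R_i)
print(A.shape, R_alpha.shape, forms_agree)
```

With these sizes, `A` has shape $(NML, NLK)$ and `R_alpha` has shape $(NLK, NLK)$, matching the dimensions required for the product in (12) to be well defined.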

Clutter:  The  ground  is  a  major  source  of  clutter  in  air-borne  radar  applications  and  is  persistent  in  all  range 
gates up to the gate corresponding to the platform horizon. Other sources of clutter surely exist, such as buildings,
trees, as well as other uninteresting targets, which are ignored. We therefore consider only ground clutter and treat
it  stochastically. 
Let us assume that there are $Q$ clutter patches indexed by parameter $q$. Each of these clutter patches comprises,
say, $P$ scatterers. The radar return from the $p$-th scatterer in the $q$-th clutter patch is given by

$$\gamma_{pq}\, \mathbf{v}(f_{c_{pq}}) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_{pq}, \phi_{pq})$$

where $\gamma_{pq}$ is its random complex reflectivity, $f_{c_{pq}}$ is the Doppler shift observed from the $p$-th scatterer in the $q$-th
clutter patch, and $\theta_{pq}, \phi_{pq}$ are the azimuth and elevation angles of this scatterer. The Doppler $f_{c_{pq}}$ is given by


$$f_{c_{pq}} = \frac{2f_o\dot{\mathbf{x}}_r^T(\mathbf{x}_r - \mathbf{x}_{pq})}{c\|\mathbf{x}_r - \mathbf{x}_{pq}\|} \tag{13}$$


where $\mathbf{x}_{pq}$ is the location of the $p$-th scatterer in the $q$-th clutter patch. Since the clutter patch is stationary, the
Doppler is purely from the motion of the aircraft, as seen in (13). The contribution from the $q$-th clutter patch to
the received signal is given by

$$\mathbf{y}_q = \sum_{p=1}^{P} \gamma_{pq}\, \mathbf{v}(f_{c_{pq}}) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_{pq}, \phi_{pq}), \tag{14}$$


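with corresponding correlation matrix

$$\mathbf{R}_\gamma^q := \mathbf{B}_q\mathbf{R}_\gamma^{pq}\mathbf{B}_q^H \tag{15}$$

where $\mathbf{B}_q = [\mathbf{v}(f_{c_{1q}}) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_{1q}, \phi_{1q}),\ \mathbf{v}(f_{c_{2q}}) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_{2q}, \phi_{2q}),\ \ldots,\ \mathbf{v}(f_{c_{Pq}}) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_{Pq}, \phi_{Pq})] \in \mathbb{C}^{NML \times P}$
and $\mathbf{R}_\gamma^{pq}$ is the correlation matrix of the random vector $[\gamma_{1q}, \gamma_{2q}, \ldots, \gamma_{Pq}]^T$. It is readily shown that the matrix $\mathbf{B}_q$
could be simplified as $\mathbf{B}_q := \breve{\mathbf{B}}_q(\mathbf{I}_P \otimes \mathbf{s})$, where $\breve{\mathbf{B}}_q := [\mathbf{v}(f_{c_{1q}}) \otimes \mathbf{A}_{1q},\ \mathbf{v}(f_{c_{2q}}) \otimes \mathbf{A}_{2q},\ \ldots,\ \mathbf{v}(f_{c_{Pq}}) \otimes \mathbf{A}_{Pq}] \in \mathbb{C}^{NML \times PN}$,
and the structure of the matrix $\mathbf{A}_{pq} \in \mathbb{C}^{NM \times N}$ (straightforward but not shown here) is defined
such that $\mathbf{s} \otimes \mathbf{a}(\theta_{pq}, \phi_{pq}) = \mathbf{A}_{pq}\mathbf{s}$, $p = 1, \ldots, P$. Assuming that a particular scatterer from one clutter patch is
uncorrelated to any other scatterer belonging to any other clutter patch, we have the net contribution of clutter
$\mathbf{y}_c = \sum_{q=1}^{Q} \mathbf{y}_q$, with corresponding correlation matrix given by

$$\mathbf{R}_c = \sum_{q=1}^{Q} \mathbf{R}_\gamma^q. \tag{16}$$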

The clutter model could further be simplified by the following arguments. Assuming a large range resolution, which
is typically the case for radar STAP [2], the scatterers in a particular clutter patch are in the same range gate and
hence are assumed to possess approximately identical Doppler shifts, i.e. $f_{c_{pq}} \approx f_{c_q}$. Similarly,
for the far field operation, and considering scatterers in the same azimuth resolution cell, and from the large range
resolution argument, we may assume $\theta_{pq} \approx \theta_q$ and $\phi_{pq} \approx \phi_q$, i.e. their nominal angular centers. These assumptions
can now be incorporated in matrix $\mathbf{B}_q$ to simplify the clutter model, see also [57].
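The factorization of the clutter matrix in (15), which isolates the waveform as $\breve{\mathbf{B}}_q(\mathbf{I}_P \otimes \mathbf{s})$, can be illustrated with a short numerical sketch. Everything below is an assumption made for illustration: the normalized Doppler and spatial frequencies are random, the reflectivity correlation is a random PSD matrix, and $\mathbf{A}_{pq}$ is taken to be $\mathbf{I}_N \otimes \mathbf{a}(\theta_{pq}, \phi_{pq})$, which is one structure that satisfies $\mathbf{s} \otimes \mathbf{a} = \mathbf{A}_{pq}\mathbf{s}$.

```python
import numpy as np

def steering(n, freq):
    """Unit-modulus steering vector of length n at normalized frequency freq."""
    return np.exp(-2j * np.pi * freq * np.arange(n))

def A_matrix(a, N):
    """One choice of A_pq: I_N kron a, so that (s kron a) = A_pq s."""
    return np.kron(np.eye(N), a[:, None])

# Illustrative sizes and random normalized frequencies for P scatterers in one patch.
N, M, L, P = 3, 2, 4, 3
rng = np.random.default_rng(2)
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
dopplers = rng.uniform(-0.5, 0.5, P)  # f_c * T_p for each scatterer
spatials = rng.uniform(-0.5, 0.5, P)  # spatial frequency for each scatterer

# Direct construction: columns v kron s kron a.
B_direct = np.column_stack([
    np.kron(steering(L, fd), np.kron(s, steering(M, th)))
    for fd, th in zip(dopplers, spatials)
])

# Factored construction: B_breve (I_P kron s), with the waveform isolated on the right.
B_breve = np.hstack([
    np.kron(steering(L, fd)[:, None], A_matrix(steering(M, th), N))
    for fd, th in zip(dopplers, spatials)
])
B_factored = B_breve @ np.kron(np.eye(P), s[:, None])

# Patch correlation matrix as in (15), with a random PSD reflectivity correlation.
G = rng.standard_normal((P, P)) + 1j * rng.standard_normal((P, P))
R_gamma = G @ G.conj().T
R_patch = B_direct @ R_gamma @ B_direct.conj().T

factorization_holds = np.allclose(B_direct, B_factored)
print(B_breve.shape, factorization_holds)
```

The factored form makes explicit that the patch correlation matrix is quadratic in the waveform, which is the source of the signal dependence discussed in the text.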
Fig. 1: Radar scene considering the ground based target at azimuth ($\theta_t$), elevation ($\phi_t$). The $(x, y, z)$ axes are local to the aircraft carrying the array.

Fig. 2: STAP data cube before matched filtering or range compression, depicting the considered range gate/cell and fast time slices (dashed lines).


III.  Waveform  Design 

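The radar return at the considered range gate is processed by a filter characterized by a weight vector, $\mathbf{w}$, whose
output is given by $\mathbf{w}^H\tilde{\mathbf{y}}$. Since the vector $\mathbf{s} \in \mathbb{C}^N$ prominently figures in the steering vectors, the objective is
to jointly obtain the desired weight vector, $\mathbf{w}$, and waveform vector, $\mathbf{s}$. It is desired that the weight vector will
minimize the output power, $E\{|\mathbf{w}^H\mathbf{y}_u|^2\} = \mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w}$. Mathematically, we may formulate this problem as:

$$\begin{aligned} \min_{\mathbf{w}, \mathbf{s}} \quad & \mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w} \\ \text{s.t.} \quad & \mathbf{w}^H(\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t)) = \kappa \\ & \mathbf{s}^H\mathbf{s} \leq P_o \end{aligned} \tag{17}$$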

In (17), the first constraint is the well known Capon constraint with $\kappa \in \mathbb{R}$, typically $\kappa = 1$. The energy
constraint, enforced via the second constraint, addresses hardware limitations. Before we derive the solutions
to the optimization problem, it is useful to recall Lem. 1, which is well known and used throughout this report,
often without being stated explicitly. This fundamental result discusses the technique to compute stationary points of a real valued
function w.r.t. its complex valued argument and its conjugate.

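Lemma 1. Let $f(\mathbf{x}, \mathbf{x}^*) : \mathbb{C}^N \to \mathbb{R}$. The stationary point of $f(\mathbf{x}, \mathbf{x}^*) = \tilde{f}(\mathbf{x}_r, \mathbf{x}_i)$ is found from the three equivalent
conditions: 1. $\nabla_{\mathbf{x}_r}\tilde{f}(\mathbf{x}_r, \mathbf{x}_i) = \mathbf{0}$ and $\nabla_{\mathbf{x}_i}\tilde{f}(\mathbf{x}_r, \mathbf{x}_i) = \mathbf{0}$, or 2. $\nabla_{\mathbf{x}}f(\mathbf{x}, \mathbf{x}^*) = \mathbf{0}$, or 3. $\nabla_{\mathbf{x}^*}f(\mathbf{x}, \mathbf{x}^*) = \mathbf{0}$. Here
$\tilde{f} : \mathbb{R}^N \times \mathbb{R}^N \to \mathbb{R}$ is the real equivalent of $f(\cdot, \cdot)$, $\mathbf{x}_r = \mathrm{Re}\{\mathbf{x}\}$, $\mathbf{x}_i = \mathrm{Im}\{\mathbf{x}\}$, where we define the gradient
$\nabla_{\mathbf{x}}f(\mathbf{x}, \mathbf{x}^*) := \left[\frac{\partial f(\mathbf{x}, \mathbf{x}^*)}{\partial x_1}, \frac{\partial f(\mathbf{x}, \mathbf{x}^*)}{\partial x_2}, \ldots, \frac{\partial f(\mathbf{x}, \mathbf{x}^*)}{\partial x_N}\right]^T$ with $x_i$ as the $i$-th element of $\mathbf{x}$, $i = 1, 2, \ldots, N$, and $\mathbf{0}$ is a column
vector of all zeros of dimension $N$.

Proof. This arises from the Wirtinger calculus; see [59]$^1$ for a recent formal proof. □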


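Optimizing (17) w.r.t. $\mathbf{w}$ first, the solution is well known, and expressed as

$$\mathbf{w}_o = \frac{\kappa\mathbf{R}_u^{-1}(\mathbf{s})(\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t))}{(\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t))^H\mathbf{R}_u^{-1}(\mathbf{s})(\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t))} \tag{18}$$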

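where $\mathbf{R}_u(\mathbf{s}) = \mathbf{R}_i + \mathbf{R}_c(\mathbf{s}) + \mathbf{R}_n$. We further emphasize that the weight vector is an explicit function of the
waveform. Now substituting $\mathbf{w}_o$ back into the cost function in (17), the minimization is purely w.r.t. $\mathbf{s}$, and cast as

$$\begin{aligned} \min_{\mathbf{s}} \quad & \frac{\kappa^2}{(\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t))^H\mathbf{R}_u^{-1}(\mathbf{s})(\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t))} \\ \text{s.t.} \quad & \mathbf{s}^H\mathbf{s} \leq P_o \end{aligned} \tag{19}$$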


A  solution  to  (19)  is  not  immediate,  given  the  dependence  of  Ru  on  the  waveform  vector  s.  We  consider  first,  the 
case  when  the  clutter  dependence  on  the  waveform  is  ignored.  Solutions  to  the  design  when  clutter  is  considered 
are  treated  subsequently. 
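For a fixed waveform, the filter step in (18) and the resulting cost in (19) can be sketched as follows. This is an illustrative example only: `mvdr_weights` is a hypothetical helper name, and the undesired-return correlation matrix is a random Hermitian positive definite stand-in rather than the waveform dependent $\mathbf{R}_u(\mathbf{s})$ of the report.

```python
import numpy as np

def steering(n, freq):
    """Unit-modulus steering vector of length n at normalized frequency freq."""
    return np.exp(-2j * np.pi * freq * np.arange(n))

def mvdr_weights(R_u, g, kappa=1.0):
    """Return w_o = kappa R_u^{-1} g / (g^H R_u^{-1} g) and the minimum output power."""
    Rinv_g = np.linalg.solve(R_u, g)      # avoids forming the explicit inverse
    denom = np.vdot(g, Rinv_g).real       # g^H R_u^{-1} g, real and positive for PD R_u
    return kappa * Rinv_g / denom, kappa ** 2 / denom

# Illustrative sizes, waveform and steering vectors (all values are assumptions).
N, M, L = 3, 2, 2
rng = np.random.default_rng(3)
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
g = np.kron(steering(L, 0.15), np.kron(s, steering(M, 0.2)))  # v kron s kron a

# Random Hermitian positive definite stand-in for R_u at this fixed waveform.
X = rng.standard_normal((N * M * L, N * M * L)) + 1j * rng.standard_normal((N * M * L, N * M * L))
R_u = X @ X.conj().T + np.eye(N * M * L)

w_o, min_power = mvdr_weights(R_u, g, kappa=1.0)
constraint = np.vdot(w_o, g)              # w_o^H g, should equal kappa
power = np.vdot(w_o, R_u @ w_o).real      # w_o^H R_u w_o
print(abs(constraint - 1.0), abs(power - min_power))
```

Using a linear solve instead of an explicit matrix inverse is a numerical design choice in this sketch; it is generally better conditioned and cheaper than inverting $\mathbf{R}_u$.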


A.  Rayleigh-Ritz:  Minimum  eigenvector  solution 

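Ignoring the dependency of $\mathbf{R}_u$ on $\mathbf{s}$, we readily see that (19) can be recast as a Rayleigh-Ritz optimization,
whose solution is given by

$$\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t) = \boldsymbol{\mu}_{\min}(\mathbf{R}_u) \tag{20}$$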


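where $\boldsymbol{\mu}_{\min}(\mathbf{R}_u)$ is the eigenvector corresponding to the minimum eigenvalue of $\mathbf{R}_u$. This tensor equation implicitly
defines the optimal $\mathbf{s}$. It is readily seen that $\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t) = \mathbf{G}\mathbf{s}$, where $\mathbf{G} = \mathbf{v}(f_d) \otimes \mathbf{A}_t$, and

$$\mathbf{A}_t = \begin{bmatrix} \mathbf{a}(\theta_t, \phi_t) & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{a}(\theta_t, \phi_t) & \cdots & \mathbf{0} \\ \vdots & & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{a}(\theta_t, \phi_t) \end{bmatrix} \in \mathbb{C}^{MN \times N}.$$

In general, the system is over-determined, and we solve this equation approximately via least squares (LS),

$$\mathbf{s} = (\mathbf{G}^H\mathbf{G})^{-1}\mathbf{G}^H\boldsymbol{\mu}_{\min}(\mathbf{R}_u). \tag{21}$$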


$^1$ Also see the refs. Brandwood, and A. van den Bos, in [59].
Moreover, from (20) and the structure of the temporal and spatial steering vectors, as well as the orthonormality of
the eigenvectors, it is readily seen that

$$\|\mathbf{v}(f_d) \otimes \mathbf{s} \otimes \mathbf{a}(\theta_t, \phi_t)\|^2 = \|\mathbf{v}(f_d)\|^2\|\mathbf{s}\|^2\|\mathbf{a}(\theta_t, \phi_t)\|^2 = LM\|\mathbf{s}\|^2 = \|\boldsymbol{\mu}_{\min}(\mathbf{R}_u)\|^2 = 1. \tag{22}$$


Hence  the  LS  solution  in  (21)  must  be  scaled  to  satisfy  the  desired  energy  requirements  of  the  radar  system. 

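Decoupling LS: The LS solution in (21) can be further simplified due to the following linear relation between
elements of $\mathbf{v}(f_d)$, $\mathbf{a}(\theta_t, \phi_t)$, $\mathbf{s}$ and elements of $\boldsymbol{\mu}_{\min}(\mathbf{R}_u)$, expressed as

$$v_l a_m s_n = \mu_h, \quad l = 1, 2, \ldots, L, \quad m = 1, 2, \ldots, M, \quad n = 1, 2, \ldots, N, \quad h = (l-1)MN + (n-1)M + m, \tag{23}$$

where $v_l, a_m, s_n$ are the $l$-th, $m$-th, $n$-th elements of $\mathbf{v}(f_d)$, $\mathbf{a}(\theta_t, \phi_t)$, $\mathbf{s}$, and $\mu_h$ is the $h$-th element of $\boldsymbol{\mu}_{\min}(\mathbf{R}_u)$,
respectively. Therefore, the LS solution in (21) decouples as

$$s_n = \frac{(\mathbf{v}(f_d) \otimes \mathbf{a}(\theta_t, \phi_t))^H\boldsymbol{\mu}_n}{(\mathbf{v}(f_d) \otimes \mathbf{a}(\theta_t, \phi_t))^H(\mathbf{v}(f_d) \otimes \mathbf{a}(\theta_t, \phi_t))}, \quad n = 1, 2, \ldots, N \tag{24}$$

where the vector $\boldsymbol{\mu}_n \in \mathbb{C}^{ML}$ for a particular $n$ consists of the $ML$ appropriate elements, $\mu_h$, $h = (l-1)MN +
(n-1)M + m$, $m = 1, 2, \ldots, M$, $l = 1, 2, \ldots, L$, as highlighted in (23).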
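The LS solution (21) and its decoupled form (24) can be compared numerically. The sketch below is a minimal illustration under stated assumptions: $\mathbf{R}_u$ is a random Hermitian positive definite stand-in that does not depend on the waveform, the steering frequencies are arbitrary, and the final rescaling to the energy budget $P_o$ is one simple way to meet the energy requirement mentioned after (22).

```python
import numpy as np

def steering(n, freq):
    """Unit-modulus steering vector of length n at normalized frequency freq."""
    return np.exp(-2j * np.pi * freq * np.arange(n))

# Illustrative sizes, steering vectors and energy budget (assumptions).
N, M, L = 4, 3, 2
P_o = 1.0
rng = np.random.default_rng(4)
v, a = steering(L, 0.15), steering(M, 0.2)

# Random Hermitian positive definite stand-in for a waveform independent R_u.
X = rng.standard_normal((N * M * L, N * M * L)) + 1j * rng.standard_normal((N * M * L, N * M * L))
R_u = X @ X.conj().T + np.eye(N * M * L)

# eigh returns eigenvalues in ascending order, so column 0 is the minimum eigenvector.
eigvals, eigvecs = np.linalg.eigh(R_u)
mu_min = eigvecs[:, 0]

# Full LS solution (21) with G = v kron A_t and A_t = I_N kron a.
A_t = np.kron(np.eye(N), a[:, None])   # MN x N
G = np.kron(v[:, None], A_t)           # LMN x N
s_ls = np.linalg.solve(G.conj().T @ G, G.conj().T @ mu_min)

# Decoupled solution (24): element h = (l-1)MN + (n-1)M + m maps to index [l, n, m].
mu_cube = mu_min.reshape(L, N, M)
va = np.kron(v, a)
s_dec = np.array([
    np.vdot(va, mu_cube[:, n, :].reshape(-1)) / np.vdot(va, va).real
    for n in range(N)
])

# Rescale to the available energy.
s_scaled = np.sqrt(P_o) * s_ls / np.linalg.norm(s_ls)
solutions_agree = np.allclose(s_ls, s_dec)
print(solutions_agree, np.linalg.norm(s_scaled) ** 2)
```

The decoupling works because $\mathbf{G}^H\mathbf{G} = LM\,\mathbf{I}_N$ for unit-modulus steering vectors, so each waveform sample is estimated from its own $ML$ entries of the eigenvector.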

The min. eigenvector solution is most relevant when noise and interference are considered and clutter is ignored
in the waveform design [3]. It has some nice spectral properties similar (but not identical) to water-filling [3],
[47]. Therefore this solution, although suboptimal, is a good initial waveform to interrogate the radar scene, but is
unfortunately well known to suffer from poor modulus and sidelobe properties. Nonetheless, in certain exceptional
cases and in the presence of clutter, this suboptimal solution is shown to be optimal, and is discussed at a later
stage. The ensuing definitions and lemma prove useful subsequently.


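Lemma 2. (a) If vectors $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\boldsymbol{\gamma}$ consist of the eigenvalues of the square but not necessarily Hermitian matrices
$\mathbf{X} \in \mathbb{C}^{N \times N}$, $\mathbf{Y} \in \mathbb{C}^{M \times M}$ and $\mathbf{X} \otimes \mathbf{Y}$, respectively, then $\boldsymbol{\gamma} = \boldsymbol{\alpha} \otimes \boldsymbol{\beta}$. (b) Also, $\mathrm{rank}(\mathbf{X} \otimes \mathbf{Y}) = \mathrm{rank}(\mathbf{X})\,\mathrm{rank}(\mathbf{Y})$.

Proof. For (a), let $\mathbf{x}_i$, $i = 1, 2, \ldots, N$ and $\mathbf{y}_j$, $j = 1, 2, \ldots, M$ be the eigenvectors corresponding to $\alpha_i$, $\beta_j$, i.e.
the $i$-th and $j$-th eigenvalues of $\mathbf{X}$, $\mathbf{Y}$, respectively. Then, from the mixed product property of the Kronecker product,
$\mathbf{X}\mathbf{x}_i \otimes \mathbf{Y}\mathbf{y}_j = (\mathbf{X} \otimes \mathbf{Y})(\mathbf{x}_i \otimes \mathbf{y}_j)$, but the eigenvector relations imply that $\mathbf{X}\mathbf{x}_i = \alpha_i\mathbf{x}_i$, $\mathbf{Y}\mathbf{y}_j = \beta_j\mathbf{y}_j$. This implies
that the $ij$-th eigenvalue of $\mathbf{X} \otimes \mathbf{Y}$ is $\gamma_{ij} = \alpha_i\beta_j$ with associated eigenvector $\mathbf{x}_i \otimes \mathbf{y}_j$. Since the rank is equal to
the number of non-zero eigenvalues for square matrices, the second follows directly from (a). Hence proved. □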
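Lemma 2 is easy to check numerically. The sketch below uses small random real matrices (an arbitrary choice for illustration) and compares the eigenvalues of the Kronecker product with all pairwise products of the individual eigenvalues; since the ordering returned by an eigenvalue solver is not guaranteed, the comparison matches each value to its nearest counterpart. The rank statement is checked with a deliberately rank-deficient factor.

```python
import numpy as np

rng = np.random.default_rng(5)

# Square, not necessarily Hermitian, matrices.
X = rng.standard_normal((3, 3))
Y = rng.standard_normal((2, 2))

alpha = np.linalg.eigvals(X)
beta = np.linalg.eigvals(Y)
gamma = np.linalg.eigvals(np.kron(X, Y))
products = np.kron(alpha, beta)  # all pairwise products alpha_i * beta_j

def max_mismatch(p, q):
    """Largest distance from any element of p to its nearest element in q."""
    return max(np.min(np.abs(q - z)) for z in p)

mismatch = max(max_mismatch(products, gamma), max_mismatch(gamma, products))

# Rank check with a rank-1 factor built from an outer product.
X_low = np.outer(rng.standard_normal(3), rng.standard_normal(3))
rank_kron = np.linalg.matrix_rank(np.kron(X_low, Y))
rank_prod = np.linalg.matrix_rank(X_low) * np.linalg.matrix_rank(Y)

print(f"eigenvalue mismatch: {mismatch:.2e}, ranks: {rank_kron} vs {rank_prod}")
```

This property is what allows positive semi-definiteness of a Kronecker product to be concluded from positive semi-definiteness of its factors.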

Definition 1. (Convexity) A function $f(\mathbf{x}) : \mathbb{R}^N \to \mathbb{R}$ is convex if:

(a) $f(t\mathbf{x}_1 + (1-t)\mathbf{x}_2) \leq tf(\mathbf{x}_1) + (1-t)f(\mathbf{x}_2)$ for any $t \in [0, 1]$
(b) If $f(\mathbf{x})$ is first order differentiable, then it is convex if $f(\mathbf{x}_j) \geq f(\mathbf{x}_i) + \nabla_{\mathbf{x}}f(\mathbf{x}_i)^T(\mathbf{x}_j - \mathbf{x}_i)$
where in (a), (b) $\mathbf{x}_i \in \mathbb{R}^N$, $i = 1, 2$, $j = 1, 2$, $i \neq j$.


From our extensive simulations, we noticed that the original cost function in (17) is not jointly convex in $\mathbf{w}$
and $\mathbf{s}$. However, it is not straightforward to prove or disprove joint convexity w.r.t. both $\mathbf{w}$ and $\mathbf{s}$ analytically.
Consider, then, the following propositions:


Proposition 1. The objective function in (17) is individually convex w.r.t. $\mathbf{s}$, for any fixed but arbitrary $\mathbf{w}$.


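Proof. Definition 1 cannot be directly invoked as the objective $g(\mathbf{s}) = \mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w} : \mathbb{C}^N \to \mathbb{R}$ depends on the waveform $\mathbf{s}$, which is complex. Consider the following transformation$^2$, $\mathbf{s} = \mathbf{D}\bar{\mathbf{s}}$, where $\bar{\mathbf{s}} \in \mathbb{R}^{2N} = [\mathrm{Re}\{\mathbf{s}\}^T, \mathrm{Im}\{\mathbf{s}\}^T]^T$ and $\mathbf{D} = [\mathbf{I}_N, j\mathbf{I}_N] \in \mathbb{C}^{N\times 2N}$. Now, we may define an equivalent $g(\bar{\mathbf{s}}) : \mathbb{R}^{2N} \to \mathbb{R}$ to invoke the definition of convexity. We have to prove that

$$
\begin{aligned}
&\mathbf{w}^H\Big(\mathbf{R}_n + \mathbf{R}_i + \sum_{q=1}^{Q}\mathbf{B}_q\big(\mathbf{I}_P\otimes\mathbf{D}(t\bar{\mathbf{s}}_1 + (1-t)\bar{\mathbf{s}}_2)\big)\mathbf{R}_\gamma^{q}\big(\mathbf{I}_P\otimes(t\bar{\mathbf{s}}_1 + (1-t)\bar{\mathbf{s}}_2)^T\mathbf{D}^H\big)\mathbf{B}_q^H\Big)\mathbf{w} \\
&\quad\le\; t\,\mathbf{w}^H\Big(\mathbf{R}_n + \mathbf{R}_i + \sum_{q=1}^{Q}\mathbf{B}_q(\mathbf{I}_P\otimes\mathbf{D}\bar{\mathbf{s}}_1)\mathbf{R}_\gamma^{q}(\mathbf{I}_P\otimes\bar{\mathbf{s}}_1^T\mathbf{D}^H)\mathbf{B}_q^H\Big)\mathbf{w} \\
&\qquad +\,(1-t)\,\mathbf{w}^H\Big(\mathbf{R}_n + \mathbf{R}_i + \sum_{q=1}^{Q}\mathbf{B}_q(\mathbf{I}_P\otimes\mathbf{D}\bar{\mathbf{s}}_2)\mathbf{R}_\gamma^{q}(\mathbf{I}_P\otimes\bar{\mathbf{s}}_2^T\mathbf{D}^H)\mathbf{B}_q^H\Big)\mathbf{w}
\end{aligned}
\tag{25}
$$

where $t \in [0,1]$ and $\bar{\mathbf{s}}_i \in \mathrm{dom}\{g(\bar{\mathbf{s}})\}$, $i = 1,2$. After elementary algebra, the convexity requirement in (25) transforms to: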


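$$
\sum_{q=1}^{Q}\mathbf{x}_q^H\big(\mathbf{R}_\gamma^{q}\otimes\mathbf{D}(\bar{\mathbf{s}}_1 - \bar{\mathbf{s}}_2)(\bar{\mathbf{s}}_1 - \bar{\mathbf{s}}_2)^T\mathbf{D}^H\big)\mathbf{x}_q \ge 0 \tag{26}
$$

where $\mathbf{x}_q \in \mathbb{C}^{NP} := \mathbf{B}_q^H\mathbf{w}$. In other words, it is sufficient to show that (26) is true; (25) is then also true, and the objective is therefore convex. We notice immediately that (26) is a sum of Hermitian quadratic forms. Consider the matrix $\mathbf{R}_\gamma^{q}\otimes\mathbf{D}(\bar{\mathbf{s}}_1 - \bar{\mathbf{s}}_2)(\bar{\mathbf{s}}_1 - \bar{\mathbf{s}}_2)^T\mathbf{D}^H$; we know that $\mathbf{R}_\gamma^{q} \succeq \mathbf{0}$$^3$, since it is a covariance matrix and by definition at least positive semi-definite (PSD). The other matrix, i.e. $\mathbf{D}(\bar{\mathbf{s}}_1 - \bar{\mathbf{s}}_2)(\bar{\mathbf{s}}_1 - \bar{\mathbf{s}}_2)^T\mathbf{D}^H$, is of course rank-1 Hermitian, and is clearly PSD. From Lem. 2, it is straightforward to show that $\mathbf{R}_\gamma^{q}\otimes\mathbf{D}(\bar{\mathbf{s}}_1 - \bar{\mathbf{s}}_2)(\bar{\mathbf{s}}_1 - \bar{\mathbf{s}}_2)^T\mathbf{D}^H \succeq \mathbf{0}$, $\forall q$. Then from the definition of positive semi-definiteness, each of the $Q$ Hermitian quadratic forms in (26) is greater than or equal to zero, hence their sum is also greater than or equal to zero. □

$^2$ Ideally one must decompose the function into real and imaginary components (as accomplished subsequently), but due to Hermitian symmetry, real valued-ness etc., we take this shortcut here instead.

$^3$ Here $\succeq$ is the Löwner partial order [58].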
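The convexity gap used in the proof of Prop. 1 can be checked numerically. The sketch below evaluates the waveform-dependent (clutter) part of the objective for a fixed filter and verifies that the gap between the chord and the function equals $t(1-t)$ times the quadratic form evaluated at $\mathbf{s}_1 - \mathbf{s}_2$, which is non-negative. The dimensions, the random matrices standing in for $\mathbf{B}_q$ and $\mathbf{R}_\gamma^q$, and the name `clutter_cost` are assumptions for illustration; the check is done directly on complex vectors rather than on the real-valued decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, Q, K = 3, 2, 4, 8  # K stands in for the filter length NML (assumed sizes)


def crandn(*shape):
    """Complex Gaussian array of the requested shape."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


w = crandn(K)                               # fixed but arbitrary filter
B = [crandn(K, N * P) for _ in range(Q)]    # hypothetical stand-ins for B_q
R_gamma = []
for _ in range(Q):
    C = crandn(P, P)
    R_gamma.append(C @ C.conj().T)          # Hermitian PSD, plays the role of R_gamma^q


def clutter_cost(s):
    """sum_q x_q^H (R_gamma^q kron s s^H) x_q with x_q = B_q^H w."""
    total = 0.0
    for Bq, Rq in zip(B, R_gamma):
        xq = Bq.conj().T @ w
        total += (xq.conj() @ np.kron(Rq, np.outer(s, s.conj())) @ xq).real
    return total


s1, s2 = crandn(N), crandn(N)
ts = np.linspace(0.0, 1.0, 11)
gaps = np.array([t * clutter_cost(s1) + (1 - t) * clutter_cost(s2)
                 - clutter_cost(t * s1 + (1 - t) * s2) for t in ts])
predicted = ts * (1 - ts) * clutter_cost(s1 - s2)
```

The arrays `gaps` and `predicted` should agree to numerical precision, and every entry of `gaps` should be non-negative.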

Proposition  2.  The  objective  function  in  (17)  is  individually  convex  w.r.t.  w,  for  any  fixed  but  arbitrary  s. 

Proof.  Given  the  guaranteed  positive  semi-definiteness  of  Ru(s),  the  proof  is  straightforward  to  demonstrate  by 
invoking  the  convexity  definition  on  the  vector  consisting  of  the  real  and  imaginary  parts  of  w.  □ 

In fact, Prop. 1 and Prop. 2 may be sharpened to include strong convexity, which, as we will show subsequently, is desired for the solutions to exist; see the note immediately after (43). For now, however, individual convexity is sufficient to proceed with our analysis.

Remark 1. (Characteristic of STAP objective) The STAP objective in (17) has at most one minimum for a fixed but arbitrary $\mathbf{w} \in \mathbb{C}^{NML}$ but $\forall\,\mathbf{s} \in \mathbb{C}^{N}$. Likewise, it has at most one minimum for a fixed but arbitrary $\mathbf{s} \in \mathbb{C}^{N}$ but $\forall\,\mathbf{w} \in \mathbb{C}^{NML}$.

This  is  concluded  readily  from  Prop.  1,  Prop.  2,  i.e.  the  individual  convexity.  An  illustrative  example  is  provided 
in  Fig.  3. 


Fig. 3: An illustrative non-convex example with multiple local minima. Contours in black are characteristic of the objective. Contours in blue violate convexity in the $\mathbf{w}$ and $\mathbf{s}$ dimensions individually, and are therefore not characteristic of the objective function.
B. Constrained alternating minimization

Motivated by Prop. 1 and Prop. 2, we propose a constrained alternating minimization technique which is iterative. Before we present details on this technique, consider the following minimization problem, which optimizes $\mathbf{s}$ for a fixed and arbitrary $\mathbf{w}$:


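$$
\begin{aligned}
\min_{\mathbf{s}}\;& \mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w} \\
\text{s. t.}\;& \mathbf{w}^H\big(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)\big) = \kappa \\
& \mathbf{s}^H\mathbf{s} \le P_o.
\end{aligned}
\tag{27}
$$

In (27), the objective function could be rewritten as,

$$
\mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w} = \mathbf{w}^H(\mathbf{R}_n + \mathbf{R}_i)\mathbf{w} + \sum_{q=1}^{Q}\mathrm{Tr}\{\mathbf{R}_\gamma^{q}(\mathbf{I}_P\otimes\mathbf{s}^H)\mathbf{x}_q\mathbf{x}_q^H(\mathbf{I}_P\otimes\mathbf{s})\}. \tag{28}
$$

In (28), the trace operation is further simplified as:

$$
\begin{aligned}
\mathrm{Tr}\{\mathbf{R}_\gamma^{q}(\mathbf{I}_P\otimes\mathbf{s}^H)\mathbf{x}_q\mathbf{x}_q^H(\mathbf{I}_P\otimes\mathbf{s})\}
&= \mathrm{vec}\big((\mathbf{R}_\gamma^{q}(\mathbf{I}_P\otimes\mathbf{s}^H)\mathbf{x}_q\mathbf{x}_q^H)^T\big)^T\,\mathrm{vec}(\mathbf{I}_P\otimes\mathbf{s}) \\
&= \mathbf{s}^H\mathbf{H}^T\big((\mathbf{R}_\gamma^{q})^T\otimes\mathbf{x}_q\mathbf{x}_q^H\big)\mathbf{H}\mathbf{s} \\
&= \mathbf{s}^H\mathbf{Z}_q(\mathbf{w})\mathbf{s}
\end{aligned}
\tag{29}
$$

where $\mathrm{vec}(\mathbf{I}_P\otimes\mathbf{s}) = \mathbf{H}\mathbf{s}$, with $\mathbf{H} \in \mathbb{R}^{P^2N\times N} = [\mathbf{H}_1^T, \mathbf{H}_2^T, \ldots, \mathbf{H}_P^T]^T$. The matrix $\mathbf{H}_k \in \mathbb{R}^{PN\times N}$, $k = 1,2,\ldots,P$, is further decomposed into $P$ matrices of size $N\times N$, and is defined such that the $k$-th $N\times N$ matrix is $\mathbf{I}_N$ and the other $(P-1)$ $N\times N$ matrices are all equal to zero matrices.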
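To make the construction in (29) concrete, the sketch below builds the selection matrix $\mathbf{H}$ block by block, forms a matrix `Z` from the Kronecker expression, and compares $\mathbf{s}^H\mathbf{Z}\mathbf{s}$ with the trace form in (28). The sizes and random inputs are assumptions. The transpose on the $P\times P$ covariance follows from the identity $\mathrm{Tr}\{\mathbf{A}^H\mathbf{C}\mathbf{A}\mathbf{R}\} = \mathrm{vec}(\mathbf{A})^H(\mathbf{R}^T\otimes\mathbf{C})\mathrm{vec}(\mathbf{A})$ and is part of this sketch's bookkeeping.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 4, 2  # illustrative sizes


def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


C = crandn(P, P)
R = C @ C.conj().T            # PSD stand-in for the P x P clutter covariance
x = crandn(N * P)             # stand-in for x_q = B_q^H w
s = crandn(N)

# Direct evaluation of Tr{R (I_P kron s^H) x x^H (I_P kron s)}.
I_s = np.kron(np.eye(P), s.reshape(N, 1))            # NP x P
direct = np.trace(R @ I_s.conj().T @ np.outer(x, x.conj()) @ I_s).real

# H stacks H_1..H_P; H_k has I_N in its k-th N x N block and zeros elsewhere.
H = np.vstack([np.kron(np.eye(P)[:, [k]], np.eye(N)) for k in range(P)])
vec_ok = np.allclose(H @ s, I_s.flatten(order="F"))  # vec(I_P kron s) = H s

Z = H.T @ np.kron(R.T, np.outer(x, x.conj())) @ H    # N x N
via_Z = (s.conj() @ Z @ s).real

rank_Z = np.linalg.matrix_rank(Z)
min_eig_Z = np.linalg.eigvalsh((Z + Z.conj().T) / 2).min()
```

With $P < N$ as chosen here, `rank_Z` cannot exceed $P$, so a single `Z` is PSD but singular.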

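Remark 2. (a) At the very least, $\sum_{q=1}^{Q}\mathbf{Z}_q \succeq \mathbf{0}$. (b) The matrix $\mathbf{Z}_q \succeq \mathbf{0}$ for $P < N$, always. (c) However, it may be positive definite, i.e. $\mathbf{Z}_q \succ \mathbf{0}$ and hence $\sum_{q=1}^{Q}\mathbf{Z}_q \succ \mathbf{0}$, for $P \ge N$ and for $\mathbf{R}_\gamma^{q} \succ \mathbf{0}$.

We note that (a) is readily implied from Prop. 1 since a Hermitian quadratic form $\mathbf{x}^H\mathbf{B}\mathbf{x}$ is convex (strictly convex) iff $\mathbf{B} \succeq \mathbf{0}$ ($\mathbf{B} \succ \mathbf{0}$). Since $(\mathbf{R}_\gamma^{q})^T\otimes\mathbf{x}_q\mathbf{x}_q^H \succeq \mathbf{0}$ always (at most $P$ non-zero eigenvalues and the rest are zeros) and $P < N$, in other words, considering the transformation $\mathbf{H}^T\big((\mathbf{R}_\gamma^{q})^T\otimes\mathbf{x}_q\mathbf{x}_q^H\big)\mathbf{H} : \mathbb{C}^{P^2N\times P^2N} \to \mathbb{C}^{N\times N}$ and the structure of $\mathbf{H}$, the result (b) is obvious. For (c), we know that $\mathrm{rank}(\mathbf{H}) = N$, hence it could be shown after some tedious algebra that $\mathbf{Z}_q$ may be PD only when $P \ge N$ and $\mathbf{R}_\gamma^{q}$ is PD in the first place; also see for example [58, pg. 399].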
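Using (28) and (29), the Lagrangian of (27) is readily cast as,

$$
\mathcal{L}(\mathbf{s},\gamma_1,\gamma_2) = \mathbf{w}^H(\mathbf{R}_i + \mathbf{R}_n)\mathbf{w} + \sum_{q=1}^{Q}\mathbf{s}^H\mathbf{Z}_q(\mathbf{w})\mathbf{s} + \mathrm{Re}\{\gamma_1(\mathbf{w}^H\mathbf{G}\mathbf{s} - \kappa)\} + \gamma_2\mathbf{s}^H\mathbf{I}_N\mathbf{s} - \gamma_2P_o \tag{30}
$$

where $\gamma_1 \in \mathbb{C}$ and $\gamma_2 \in \mathbb{R}^+$ are the complex and real Lagrange parameters.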

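Lagrange Dual: The Lagrange dual is denoted as $\mathcal{H}(\gamma_1,\gamma_2) = \inf_{\mathbf{s}}\mathcal{L}(\mathbf{s},\gamma_1,\gamma_2)$. Since (30) consists of Hermitian quadratic forms and other linear terms of $\mathbf{s}$, we have $\mathcal{H}(\gamma_1,\gamma_2) = \mathcal{L}(\mathbf{s}_o(\gamma_1,\gamma_2),\gamma_1,\gamma_2)$, where $\mathbf{s}_o(\gamma_1,\gamma_2)$ is obtained by solving the first order optimality conditions, i.e.

$$
\frac{\partial\mathcal{L}(\mathbf{s},\gamma_1,\gamma_2)}{\partial\mathbf{s}} = \mathbf{0} \tag{31}
$$

where $\mathbf{0}$ is a column vector of size $N$ and consists of all zeros. Further, in (31), while taking the derivative the usual rules of complex vector differentiation apply, i.e. treat $\mathbf{s}^H$ independent of $\mathbf{s}$. The solution to (31) is readily obtained by differentiating (30), and expressed as:

$$
\mathbf{s}_o(\gamma_1,\gamma_2) = -\frac{\gamma_1^{*}}{2}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}) + \gamma_2\mathbf{I}_N\Big)^{-1}\mathbf{G}^H\mathbf{w}. \tag{32}
$$

Using (32), the dual $\mathcal{H}(\gamma_1,\gamma_2)$ is given by:

$$
\mathcal{H}(\gamma_1,\gamma_2) = \mathbf{w}^H(\mathbf{R}_i + \mathbf{R}_n)\mathbf{w} - \kappa\,\mathrm{Re}\{\gamma_1\} - \gamma_2P_o - \frac{|\gamma_1|^2}{4}\mathbf{w}^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}) + \gamma_2\mathbf{I}_N\Big)^{-1}\mathbf{G}^H\mathbf{w}. \tag{33}
$$

Equation (33) is further simplified by decomposing $\gamma_1 = \gamma_{1r} + j\gamma_{1i}$. In which case, we notice that (33) is quadratic in $\gamma_{1r}$, $\gamma_{1i}$, and purely linear in $\gamma_2$. The Lagrange dual optimization is therefore,

$$
\begin{aligned}
\max_{\gamma_{1r},\gamma_{1i},\gamma_2}\;& \mathcal{H}(\gamma_{1r},\gamma_{1i},\gamma_2) \\
\text{s. t.}\;& \gamma_2 \ge 0.
\end{aligned}
\tag{34}
$$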


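Maximizing first w.r.t. $\gamma_{1r}$, $\gamma_{1i}$, we have the solutions

$$
\gamma_{1r} = \frac{-2\kappa}{\mathbf{w}^H\mathbf{G}\big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}) + \gamma_2\mathbf{I}_N\big)^{-1}\mathbf{G}^H\mathbf{w}},\qquad \gamma_{1i} = 0.
$$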


Substituting  the  above  solutions  into  (33),  the  Lagrange  dual  optimization  problem  and  after  ignoring  an  unnecessary 



Approved  for  public  release;  distribution  unlimited. 


P.  SETLUR  AND  M.  RANGASWAMY:  AFRL  SENSORS  DIRECTORATE  TECH.  REPORT.  2014. 




additive  constant,  takes  the  form. 


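$$
\begin{aligned}
\max_{\gamma_2}\;& \kappa^2\Big(\mathbf{w}^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}) + \gamma_2\mathbf{I}_N\Big)^{-1}\mathbf{G}^H\mathbf{w}\Big)^{-1} - \gamma_2P_o \\
\text{s. t.}\;& \gamma_2 \ge 0
\end{aligned}
\tag{35}
$$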


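The associated Lagrangian for (35) is

$$
\mathcal{D}(\gamma_2,\gamma_3) = \frac{\kappa^2}{\mathbf{w}^H\mathbf{G}\mathbf{F}^{-1}\mathbf{G}^H\mathbf{w}} - \gamma_2P_o - \gamma_3\gamma_2 \tag{36}
$$

where $\mathbf{F} := \sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}) + \gamma_2\mathbf{I}_N$. The first order optimality condition for the optimization (35) is given by:

$$
\frac{\partial}{\partial\gamma_2}\Big\{\frac{\kappa^2}{\mathbf{w}^H\mathbf{G}\mathbf{F}^{-1}\mathbf{G}^H\mathbf{w}}\Big\} - P_o - \gamma_3 = 0
$$

or

$$
\frac{-\kappa^2}{(\mathbf{w}^H\mathbf{G}\mathbf{F}^{-1}\mathbf{G}^H\mathbf{w})^2}\,\mathbf{w}^H\mathbf{G}\frac{\partial\mathbf{F}^{-1}}{\partial\gamma_2}\mathbf{G}^H\mathbf{w} - P_o - \gamma_3 = 0
$$

or

$$
\frac{\kappa^2}{(\mathbf{w}^H\mathbf{G}\mathbf{F}^{-1}\mathbf{G}^H\mathbf{w})^2}\,\mathbf{w}^H\mathbf{G}\Big(\mathbf{F}^{-1}\frac{\partial\mathbf{F}}{\partial\gamma_2}\mathbf{F}^{-1}\Big)\mathbf{G}^H\mathbf{w} - P_o - \gamma_3 = 0
$$

where $\gamma_3$ is the Lagrange multiplier associated with the Lagrangian (36), and we also have $\frac{\partial\mathbf{F}}{\partial\gamma_2} = \mathbf{I}_N$. The complementary slackness and constraint qualifier for (35), i.e. $\gamma_3\gamma_2 = 0$ and $\gamma_2 \ge 0$, form the rest of the equations comprising the KKT conditions. It is now readily shown that the solution to (35) is given by

$$
\begin{aligned}
&\gamma_2 = \max[0,\bar{\gamma}_2] \\
&\bar{\gamma}_2 \text{ solves } \bar{\gamma}_2\big(\kappa^2\mathbf{w}^H\mathbf{G}\mathbf{F}^{-2}\mathbf{G}^H\mathbf{w} - P_o(\mathbf{w}^H\mathbf{G}\mathbf{F}^{-1}\mathbf{G}^H\mathbf{w})^2\big) = 0.
\end{aligned}
\tag{37}
$$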


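Proposition 3. The parameter $\bar{\gamma}_2 = 0$ solves (37).

Proof. The spectral theorem for Hermitian matrices allows for a decomposition, $\mathbf{F} = \mathbf{E}(\boldsymbol{\Lambda} + \gamma_2\mathbf{I}_N)\mathbf{E}^H$. The matrix $\boldsymbol{\Lambda}$ is a diagonal matrix comprising eigenvalues in descending order, whereas $\mathbf{E}$ is unitary and its columns are the corresponding eigenvectors of $\mathbf{F}$. For ease of exposition, denote $\mathbf{z} \in \mathbb{C}^N := \mathbf{E}^H\mathbf{G}^H\mathbf{w}$, then assume a function $f(\gamma_2) : \mathbb{R}^+ \to \mathbb{R}$, expressed as

$$
\begin{aligned}
f(\gamma_2) &:= \kappa^2\mathbf{w}^H\mathbf{G}\mathbf{F}^{-2}\mathbf{G}^H\mathbf{w} - P_o(\mathbf{w}^H\mathbf{G}\mathbf{F}^{-1}\mathbf{G}^H\mathbf{w})^2 \\
&= \kappa^2\sum_{n=1}^{N}\frac{|z_n|^2}{(d_n + \gamma_2)^2} - P_o\Big(\sum_{n=1}^{N}\frac{|z_n|^2}{d_n + \gamma_2}\Big)^2
\end{aligned}
\tag{38}
$$

where $z_n$, $d_n$ are the $n$-th element of $\mathbf{z}$ and the $n$-th eigenvalue in $\boldsymbol{\Lambda}$, respectively. We analyze $f(\gamma_2)$ and $\gamma_2f(\gamma_2)$ in detail. The following (behavior at $0$ and $\infty$) are readily observed

$$
\lim_{\gamma_2\to\infty}f(\gamma_2) = f(\infty) = 0 \tag{39a}
$$

$$
\lim_{\gamma_2\to 0}f(\gamma_2) = \kappa^2\sum_{n=1}^{N}\frac{|z_n|^2}{d_n^2} - P_o\Big(\sum_{n=1}^{N}\frac{|z_n|^2}{d_n}\Big)^2 = f(0). \tag{39b}
$$

Furthermore, it is seen that

$$
\lim_{\gamma_2\to\infty}\gamma_2f(\gamma_2) = \lim_{\gamma_2\to\infty}\frac{f(\gamma_2)}{1/\gamma_2} = \lim_{\gamma_2\to\infty}\frac{f'(\gamma_2)}{(-1/\gamma_2^2)} = 0. \tag{40}
$$

Moreover, consider $f(\gamma_2) = h_1(\gamma_2) - h_2(\gamma_2) = 0$, where $h_1(\gamma_2) = \kappa^2\sum_{n=1}^{N}\frac{f_n^2(\gamma_2)}{|z_n|^2}$, $h_2(\gamma_2) = P_o\big(\sum_{n=1}^{N}f_n(\gamma_2)\big)^2$, and $f_n(\gamma_2) = \frac{|z_n|^2}{d_n + \gamma_2}$. Note that $f_n(\gamma_2)$, $n = 1,2,\ldots,N$, and $h_i(\gamma_2)$, $i = 1,2$, are decreasing functions w.r.t. $\gamma_2 \in [0,\infty)$. Then the equation $f(\gamma_2) = 0$ implies that

$$
\begin{aligned}
&\kappa^2\sum_{n=1}^{N}\frac{|z_n|^2}{(d_n + \gamma_2)^2} - P_o\Big(\sum_{n=1}^{N}\frac{|z_n|^2}{d_n + \gamma_2}\Big)^2 = 0 \\
\text{or}\quad&\sum_{n=1}^{N}\Big(\frac{\kappa^2}{|z_n|^2} - P_o\Big)f_n^2(\gamma_2) = P_o\sum_{n_1}\sum_{n_2\ne n_1}f_{n_1}(\gamma_2)f_{n_2}(\gamma_2)
\end{aligned}
\tag{41}
$$

where $(n_1,n_2) \in (1,2,\ldots,N)$. Recall that $d_n \ne 0\ \forall n$, $d_n \ge d_{n+1}$, $n = 1,2,\ldots,N$, and $|z_n| \ne 0\ \forall n$. A solution to (41) for $\gamma_2 \in [0,\infty)$ is readily derived in the trivial case, for example when $f_{n_1}(\gamma_2) = f_{n_2}(\gamma_2)$, $P_o \ne \kappa^2$, and for $|z_n|$ to be some arbitrary constant for all $n$. For $P_o > \kappa$ it may now be shown numerically that a solution to (41) for $\gamma_2 \in [0,\infty)$ does not exist. □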


In fact, our extensive numerical simulations reveal that in general, assuming $P_o > \kappa$ and for $\gamma_{21} < \gamma_{22}$,

$$
\begin{cases}
f(\gamma_{21}) > f(\gamma_{22}) & \text{if } f(0) > 0 \\
f(\gamma_{21}) < f(\gamma_{22}) & \text{if } f(0) < 0
\end{cases}
\qquad \gamma_{21} \text{ and } \gamma_{22} \in [0,\infty). \tag{42}
$$

That is, $f(\gamma_2)$ is monotonic. From the above arguments, therefore, $\gamma_2f(\gamma_2) = 0$ implies that $\gamma_2 = 0$. Alternatively, a solution to (37) may be found numerically and is computationally cheap.

Note: (Inactive power constraint) It is noted that trivially $\gamma_2 = 0$ may always be chosen as a solution with suitable choices of the free parameter $P_o$. This implies that the power constraint is always satisfied and hence is an inactive constraint in the corresponding Lagrangian.

A graphical behavior of $h_i(\gamma_2)$, $i = 1,2$, and thus the behavior of $f(\gamma_2)$, is seen from Fig. 4. Using Prop. 3, the waveform design solution is unique, a function of $\mathbf{w}$, and expressed as,

$$
\mathbf{s}_o(\mathbf{w}) = \frac{\kappa\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w})\Big)^{-1}\mathbf{G}^H\mathbf{w}}{\mathbf{w}^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w})\Big)^{-1}\mathbf{G}^H\mathbf{w}}. \tag{43}
$$
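The closed form in (43) can be exercised on synthetic data. In the sketch below, `Zsum` is a random positive definite stand-in for $\sum_q\mathbf{Z}_q(\mathbf{w})$, `G` and `w` are random, and `waveform_update` is a hypothetical helper name; the power constraint is taken as inactive, in line with Prop. 3. The code checks that the Capon constraint is met and that feasible perturbations never lower the cost.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 4, 12     # K stands in for NML (assumed sizes)
kappa = 1.0      # assumed constraint level


def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


w = crandn(K)
G = crandn(K, N)                           # hypothetical; w^H G s is the constraint
C = crandn(N, N)
Zsum = C @ C.conj().T + 0.1 * np.eye(N)    # PD stand-in for sum_q Z_q(w)


def waveform_update(Zsum, G, w, kappa):
    """s = kappa * Zsum^-1 G^H w / (w^H G Zsum^-1 G^H w), cf. (43)."""
    t = np.linalg.solve(Zsum, G.conj().T @ w)
    return kappa * t / (w.conj() @ G @ t)


def cost(s):
    """Waveform-dependent part of the objective, s^H Zsum s."""
    return (s.conj() @ Zsum @ s).real


s_o = waveform_update(Zsum, G, w, kappa)
constraint_value = w.conj() @ G @ s_o

# Perturb s_o along directions d with w^H G d = 0 so feasibility is preserved.
gvec = G.conj().T @ w
never_better = True
for _ in range(100):
    d = crandn(N)
    d = d - gvec * (gvec.conj() @ d) / (gvec.conj() @ gvec)
    if cost(s_o + d) < cost(s_o) - 1e-12:
        never_better = False
```

Using `numpy.linalg.solve` instead of forming the inverse explicitly is a numerical convenience of the sketch, not something prescribed by the text.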

Note: (Strong convexity) To compute the constrained alternating minimization solutions, the respective matrices in (43), (18) must be invertible, implying strong convexity individually w.r.t. $\mathbf{w}$, $\mathbf{s}$, respectively. This directly necessitates $\lambda_{\min}\big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w})\big) \ne 0$ and $\lambda_{\min}(\mathbf{R}_u(\mathbf{s})) \ne 0$, and hence also positive definiteness of these matrices.

Fig. 4: Two cases are presented assuming $P_o > \kappa$. (a) Blue: $h_1(\gamma_2)$, Red: $h_2(\gamma_2)$, and therefore $f(\gamma_2)$ is decreasing. (b) Blue: $h_2(\gamma_2)$, Red: $h_1(\gamma_2)$, and therefore $f(\gamma_2)$ is increasing. The blue and red curves intersect at $\infty$.

The alternating minimization algorithm is now succinctly stated in Table I.


Remark 3. (Strong duality) The optimal value of the Lagrange dual problem is given by

$$
\mathbf{w}^H(\mathbf{R}_i + \mathbf{R}_n)\mathbf{w} + \frac{\kappa^2}{\mathbf{w}^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w})\Big)^{-1}\mathbf{G}^H\mathbf{w}}.
$$

It is therefore trivial to show that the duality gap between (27) and (34) is zero. In other words, strong duality holds between the primal in (27) and the dual in (34). From Slater's condition [60], the sufficient condition to ensure strong duality is the existence of (43), i.e. the inverse of $\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w})$ exists (see note below), and that the solution in (43) satisfies the power constraint.

Note: (Lower bound on $Q$) Since $\mathrm{rank}\big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w})\big) \le \sum_{q=1}^{Q}\mathrm{rank}(\mathbf{Z}_q(\mathbf{w}))$, assume the worst case $P = 1$, then we have that $\mathrm{rank}(\mathbf{Z}_q) = 1$. Therefore for $Q$ distinct (different spatial signature and Doppler) clutter patches, $Q \ge N$ ensures invertibility of $\sum_{q=1}^{Q}\mathbf{Z}_q$.

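1) Convergence, performance guarantees, and other properties: Denote $(\mathbf{w}_k,\mathbf{s}_k)$ as the sequence of iterates of the algorithm in Table I and define $g(\mathbf{w}_k,\mathbf{s}_k) := \mathbf{w}_k^H\mathbf{R}_u(\mathbf{s}_k)\mathbf{w}_k$, then for $k = 1,2,\ldots$

$$
\cdots \ge g(\mathbf{w}_k,\mathbf{s}_{k-1}) \ge g(\mathbf{w}_k,\mathbf{s}_k) \ge g(\mathbf{w}_{k+1},\mathbf{s}_k) \tag{44}
$$

Moreover, since at least $\mathbf{R}_u(\mathbf{s}) \succeq \mathbf{0}$, i.e. PSD $\forall\mathbf{s}$, we have that $g(\mathbf{w},\mathbf{s}) \ge 0$, $\forall\mathbf{w}$. Therefore each of the individual terms in (44) is lower bounded by zero, in other words $g(\mathbf{w}_{k_1},\mathbf{s}_{k_2}) \ge 0$, $k_1 = k$ or $k+1$ and $k_2 = k$ or $k+1$, for $k = 1,2,\ldots$.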

Proposition 4. Iff the iterates $(\mathbf{w}_k,\mathbf{s}_k)$ of the constrained alternating minimization exist, then $\lim_{k\to\infty}g(\mathbf{w}_k,\mathbf{s}_k)$ is finite.

Proof. Given the non-increasing property in (44), and since each term in (44) is lower bounded, a straightforward application of the monotone convergence theorem to the sequence $\{g(\mathbf{w}_k,\mathbf{s}_k)\}$ completes the proof. □

TABLE I: Constrained alternating minimization for waveform adaptive radar STAP

1) Initialize: Start with an initial waveform design, defined as $\mathbf{s}_o^{(0)}$, set counter $k = 1$.
2) Filter design: Design the optimal filter weight vector, $\mathbf{w}_o^{(k)} = \mathbf{w}_o(\mathbf{s}_o^{(k-1)})$, where (18) is used to compute $\mathbf{w}_o(\cdot)$.
3) Waveform design: Design the updated waveform $\mathbf{s}_o^{(k)} = \mathbf{s}_o(\mathbf{w}_o^{(k)})$, where (43) is used to compute $\mathbf{s}_o(\cdot)$.
4) Check: If convergence is achieved, exit, else $k = k + 1$, go back to step 2.
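The iteration of Table I and the monotone behavior in (44) can be sketched on a small synthetic problem. Several ingredients below are assumptions rather than quantities from the report: the dimensions, the random $\mathbf{B}_q$ and $\mathbf{R}_\gamma^q$, an identity matrix for $\mathbf{R}_n + \mathbf{R}_i$, the choice $\mathbf{G} = \mathbf{v}\otimes\mathbf{I}_N\otimes\mathbf{a}$ (so that $\mathbf{G}\mathbf{s} = \mathbf{v}\otimes\mathbf{s}\otimes\mathbf{a}$), and a Capon-type filter step standing in for (18), which is not reproduced in this section. The power constraint is treated as inactive, following Prop. 3.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, L, P, Q = 3, 2, 2, 2, 8      # assumed sizes
K = N * M * L
kappa = 1.0


def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


v, a = crandn(L), crandn(M)                                # Doppler and spatial steering stand-ins
G = np.kron(np.kron(v[:, None], np.eye(N)), a[:, None])    # K x N, G s = v kron s kron a
R_ni = np.eye(K)                                           # stand-in for R_n + R_i
B = [crandn(K, N * P) for _ in range(Q)]
R_gam = []
for _ in range(Q):
    C = crandn(P, P)
    R_gam.append(C @ C.conj().T)


def R_u(s):
    """R_ni + sum_q B_q (R_gamma^q kron s s^H) B_q^H."""
    ssH = np.outer(s, s.conj())
    return R_ni + sum(Bq @ np.kron(Rq, ssH) @ Bq.conj().T for Bq, Rq in zip(B, R_gam))


def g(w, s):
    return (w.conj() @ R_u(s) @ w).real


def filter_step(s):
    """Assumed Capon-type solution: min w^H R_u(s) w s.t. w^H G s = kappa."""
    t = np.linalg.solve(R_u(s), G @ s)
    return kappa * t / ((G @ s).conj() @ t)


def Z_sum(w):
    """sum_q Z_q(w), using Z_q = X_q (R_gamma^q)^T X_q^H with X_q the N x P reshape of B_q^H w."""
    Z = np.zeros((N, N), dtype=complex)
    for Bq, Rq in zip(B, R_gam):
        Xq = (Bq.conj().T @ w).reshape(P, N).T
        Z += Xq @ Rq.T @ Xq.conj().T
    return Z


def waveform_step(w):
    """Closed form (43) with the power constraint inactive."""
    gvec = G.conj().T @ w
    t = np.linalg.solve(Z_sum(w), gvec)
    return kappa * t / (gvec.conj() @ t)


s = crandn(N)                      # step 1: initial waveform
history = []
for k in range(10):
    w = filter_step(s)             # step 2: filter design
    history.append(g(w, s))
    s = waveform_step(w)           # step 3: waveform design
    history.append(g(w, s))

final_constraint = w.conj() @ G @ s
```

The list `history` interleaves $g(\mathbf{w}_k,\mathbf{s}_{k-1})$ and $g(\mathbf{w}_k,\mathbf{s}_k)$ and should be non-increasing. A practical stopping rule for step 4 could threshold the difference of consecutive entries; the fixed iteration count here is only for brevity.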

We note that convergence to a finite limit as evidenced from Prop. 4 is indeed dependent on the constraints via the existence of the iterates $(\mathbf{w}_k,\mathbf{s}_k)$. This however does not imply convergence of the sequence $\{(\mathbf{w}_k,\mathbf{s}_k)\}$, for which, consider the following.

Remark 4. The alternating minimization is a special case of the block Gauss-Seidel and block co-ordinate descent (BCD) algorithm with block size equal to two [12], [15].

Definition 2. (Convergence in $\mathbb{R}^N$) A sequence $\{\mathbf{x}_k\} \in \mathbb{R}^N$, $k = 1,2,\ldots$ is said to converge to $\mathbf{x}$, a limit point, if, $\forall\epsilon > 0$, $\exists K \in \mathbb{N} : \|\mathbf{x}_k - \mathbf{x}\| < \epsilon$, $k \ge K$.

Lemma 3. (Constrained alternating minimization lemma) Assume that a function $g(\mathbf{z}) : \mathbb{R}^{2N} \to \mathbb{R}$, $\mathbf{z} = [\mathbf{x}^T,\mathbf{y}^T]^T$, is continuously differentiable over a closed nonempty convex set, $\mathcal{A} = \mathcal{A}_1\times\mathcal{A}_2$. Also, suppose the solutions to the constrained optimization problems, $\min_{\mathbf{x}\in\mathcal{A}_1}g(\mathbf{x},\mathbf{y})$ and $\min_{\mathbf{y}\in\mathcal{A}_2}g(\mathbf{x},\mathbf{y})$, are uniquely attained. Let $\{\mathbf{z}_k\}$ be the sequence generated by this algorithm, then every limit point of this sequence is also a stationary point.

Proof. The proof in [14, Prop. 2.7.1] applies immediately to the alternating minimization assuming two blocks. Also see [15], where the convergence of the two block BCD was analyzed. □

The above Lem. 3 discusses convergence of the constrained alternating minimization. This lemma can be applied by decomposing our problem into its real equivalent, along with the real and imaginary decomposition of $\mathbf{w}$, $\mathbf{s}$, and assuming that our constraint set $\mathcal{A} = \mathcal{A}_1\times\mathcal{A}_2$ is closed and convex and that the minimizers are unique. The necessary condition of a unique minimizer [10] at each step is not obvious, but [9] showed that in the absence of this assumption the algorithm cycles endlessly around a particular objective value [14]. Further, the algorithm then provides limit points which are not stationary points [15]. To discuss the characteristics of the limit points at convergence, consider the remark presented next.

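Remark 5. (Characterizing the solutions at convergence) If $(\mathbf{w}^*,\mathbf{s}^*)$ are the limit points of the sequence $\{(\mathbf{w}_k,\mathbf{s}_k)\}$, then $(\mathbf{w}^*,\mathbf{s}^*)$ is a local minimum, i.e. by definition $g(\mathbf{w}^*,\mathbf{s}^*) \le g(\mathbf{w},\mathbf{s})$, $\exists\epsilon > 0$ with $(\mathbf{w},\mathbf{s}) : \sqrt{\|\mathbf{w} - \mathbf{w}^*\|^2 + \|\mathbf{s} - \mathbf{s}^*\|^2} \le \epsilon$. Further, $(\mathbf{w}^*,\mathbf{s}^*) : g(\mathbf{w}^*,\mathbf{s}^*) \le g(\mathbf{w}^*,\mathbf{s})$, $\forall\mathbf{s} \in \mathcal{A}_2$, and $g(\mathbf{w}^*,\mathbf{s}^*) \le g(\mathbf{w},\mathbf{s}^*)$, $\forall\mathbf{w} \in \mathcal{A}_1$.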

The first statement in Rem. 5 results directly from the stationarity condition as given in Lem. 3 and also since the objective is non-convex. The second statement in Rem. 5 arises from the individual convexity in $\mathbf{w}$ and $\mathbf{s}$ as shown in Prop. 1, Prop. 2. We note readily from Rem. 1 that unfortunately there is nothing special or strong about $(\mathbf{w}^*,\mathbf{s}^*)$ except the fact that they are local minima. It is well known that global extrema (minima or maxima) are guaranteed only when the objective is either convex or concave. For a problem similar to ours where the alternating minimization was applied, see [33, pg. 3537], the authors state that their algorithm produces limit points which are stronger than local maxima; in our opinion this conclusion is suspect. They further claim that their algorithm produces global extrema in their filter design and waveform dimensions individually, which leads us to believe that their objective is concave, although this was never proved in [33]. In our opinion, Rem. 1 is also relevant to their objective by replacing minima by maxima, and hence we do not believe that the limit points produced by their algorithm are stronger than local extrema.

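To derive the upper and lower bounds on $g(\mathbf{w}_k,\mathbf{s}_k) - g(\mathbf{w}_{k+1},\mathbf{s}_k)$, the following well known lemmas are useful.

Lemma 4. For any Hermitian matrix $\mathbf{A} \in \mathbb{C}^{N\times N}$ and any arbitrary vector $\mathbf{x} \in \mathbb{C}^{N}$, we always have $\lambda_{\min}(\mathbf{A})\|\mathbf{x}\|^2 \le \mathbf{x}^H\mathbf{A}\mathbf{x} \le \lambda_{\max}(\mathbf{A})\|\mathbf{x}\|^2$, where $\lambda_{\min}(\mathbf{A})$ and $\lambda_{\max}(\mathbf{A})$ are the min. and max. eigenvalues of matrix $\mathbf{A}$, respectively.

Proof. The proof can be seen in [58], and is in fact fundamental to the Rayleigh-Ritz theorem. □

Lemma 5. For any two Hermitian matrices $\mathbf{A}$, $\mathbf{B}$, both in $\mathbb{C}^{N\times N}$,

$$
\sum_{i=1}^{N}\lambda_i(\mathbf{A})\lambda_{N-i+1}(\mathbf{B}) \le \mathrm{Tr}\{\mathbf{A}\mathbf{B}\} \le \sum_{i=1}^{N}\lambda_i(\mathbf{A})\lambda_i(\mathbf{B})
$$

where $\lambda_i(\cdot) \ge \lambda_{i+1}(\cdot)$, $i = 1,2,\ldots,N$.

Proof. See [61, Lemma II.1] for a proof. □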
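Both lemmas are easy to spot-check numerically. The sketch below draws random Hermitian matrices and a random vector (all assumed, illustrative inputs) and evaluates the Rayleigh quotient bounds of Lem. 4 and the trace bounds of Lem. 5.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 5  # illustrative size


def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


def random_hermitian(n):
    C = crandn(n, n)
    return (C + C.conj().T) / 2


A = random_hermitian(N)
Bm = random_hermitian(N)
x = crandn(N)

lam_A = np.linalg.eigvalsh(A)[::-1]    # eigenvalues in descending order
lam_B = np.linalg.eigvalsh(Bm)[::-1]

# Lemma 4: lambda_min ||x||^2 <= x^H A x <= lambda_max ||x||^2
quad = (x.conj() @ A @ x).real
norm2 = np.vdot(x, x).real
rayleigh_ok = lam_A[-1] * norm2 - 1e-9 <= quad <= lam_A[0] * norm2 + 1e-9

# Lemma 5: sum_i lam_i(A) lam_{N-i+1}(B) <= Tr{AB} <= sum_i lam_i(A) lam_i(B)
tr = np.trace(A @ Bm).real
trace_lower = np.sum(lam_A * lam_B[::-1])
trace_upper = np.sum(lam_A * lam_B)
```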

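Consider $g(\mathbf{w}_k,\mathbf{s}_k)$, we have

$$
\begin{aligned}
g(\mathbf{w}_k,\mathbf{s}_k) &= \mathbf{w}_k^H\mathbf{R}_u(\mathbf{s}_k)\mathbf{w}_k \\
&= (\mathbf{w}_k - \mathbf{w}_{k+1} + \mathbf{w}_{k+1})^H\mathbf{R}_u(\mathbf{s}_k)(\mathbf{w}_k - \mathbf{w}_{k+1} + \mathbf{w}_{k+1}) \\
&= (\mathbf{w}_k - \mathbf{w}_{k+1})^H\mathbf{R}_u(\mathbf{s}_k)(\mathbf{w}_k - \mathbf{w}_{k+1}) + \mathbf{w}_{k+1}^H\mathbf{R}_u(\mathbf{s}_k)\mathbf{w}_{k+1} + 2\,\mathrm{Re}\{(\mathbf{w}_k - \mathbf{w}_{k+1})^H\mathbf{R}_u(\mathbf{s}_k)\mathbf{w}_{k+1}\}
\end{aligned}
\tag{45}
$$

Moreover, since the square root decomposition exists, i.e. $\mathbf{R}_u(\cdot) = \mathbf{R}_u^{1/2}(\cdot)\mathbf{R}_u^{H/2}(\cdot)$, application of the Cauchy-Schwarz inequality produces,

$$
\mathrm{Re}\{(\mathbf{w}_k - \mathbf{w}_{k+1})^H\mathbf{R}_u(\mathbf{s}_k)\mathbf{w}_{k+1}\} \le \sqrt{(\mathbf{w}_k - \mathbf{w}_{k+1})^H\mathbf{R}_u(\mathbf{s}_k)(\mathbf{w}_k - \mathbf{w}_{k+1})}\,\sqrt{\mathbf{w}_{k+1}^H\mathbf{R}_u(\mathbf{s}_k)\mathbf{w}_{k+1}} \tag{46}
$$

Using (46) in (45) and since $\mathbf{R}_u(\cdot)$ is PSD, we can show that $g(\mathbf{w}_k,\mathbf{s}_k) - g(\mathbf{w}_{k+1},\mathbf{s}_k) \le (\mathbf{w}_k - \mathbf{w}_{k+1})^H\mathbf{R}_u(\mathbf{s}_k)(\mathbf{w}_k - \mathbf{w}_{k+1})$. Further using (44), we have the following upper and lower bounds

$$
0 \le g(\mathbf{w}_k,\mathbf{s}_k) - g(\mathbf{w}_{k+1},\mathbf{s}_k) \le (\mathbf{w}_k - \mathbf{w}_{k+1})^H\mathbf{R}_u(\mathbf{s}_k)(\mathbf{w}_k - \mathbf{w}_{k+1}) \tag{47}
$$

We notice immediately that at convergence $(\mathbf{w}_k - \mathbf{w}_{k+1})^H\mathbf{R}_u(\mathbf{s}_k)(\mathbf{w}_k - \mathbf{w}_{k+1}) \to 0$ since $\mathbf{w}_k \to \mathbf{w}_{k+1}$. Other bounds as in (47) can be readily derived. From Lem. 4, we can show that

$$
\begin{aligned}
\lambda_{\min}(\mathbf{R}_u(\mathbf{s}_k))\|\mathbf{w}_k\|^2 - \lambda_{\max}(\mathbf{R}_u(\mathbf{s}_k))\|\mathbf{w}_{k+1}\|^2
&\le g(\mathbf{w}_k,\mathbf{s}_k) - g(\mathbf{w}_{k+1},\mathbf{s}_k) \\
&\le \lambda_{\max}(\mathbf{R}_u(\mathbf{s}_k))\|\mathbf{w}_k\|^2 - \lambda_{\min}(\mathbf{R}_u(\mathbf{s}_k))\|\mathbf{w}_{k+1}\|^2.
\end{aligned}
\tag{48}
$$

Consider the following.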

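Lemma 6. If $\mathbf{x}$, $\mathbf{y}$ are arbitrary but distinct complex vectors of size $N$ and $\mathbf{A} := \mathbf{x}\mathbf{x}^H - \mathbf{y}\mathbf{y}^H$, then, (a) matrix $\mathbf{A}$ has exactly two real non-zero eigenvalues, and the remaining $N - 2$ eigenvalues are all zeros, (b) of the two real and non-zero eigenvalues one is always positive and the other is always negative, and (c) if $\mathbf{x}$, $\mathbf{y}$ are not distinct, i.e. $\mathbf{y} = \beta\mathbf{x}$, $\beta \in \mathbb{C}$, then there exists only one non-zero eigenvalue, $(1 - |\beta|^2)\|\mathbf{x}\|^2$, and the remaining $N - 1$ eigenvalues are purely zeros.

Proof. First of all we notice $\mathbf{A}$ is Hermitian and hence its eigenvalues are real. The proof for (a) is obvious given the fact that $\mathbf{A}$ is a sum of two distinct outer products. In other words, $\mathrm{rank}(\mathbf{A}) = 2$, for all $\mathbf{y} \ne \beta\mathbf{x}$.

Now we know that

$$
\begin{aligned}
\mathrm{Tr}\{\mathbf{A}\} &= \lambda_1 + \lambda_2 = \mathbf{x}^H\mathbf{x} - \mathbf{y}^H\mathbf{y} \\
\mathrm{Tr}\{\mathbf{A}\mathbf{A}^H\} &= \lambda_1^2 + \lambda_2^2 = \|\mathbf{x}\|^4 + \|\mathbf{y}\|^4 - 2|\mathbf{x}^H\mathbf{y}|^2
\end{aligned}
$$

where $\lambda_i$, $i = 1,2$ are the two non-zero eigenvalues of $\mathbf{A}$. The above set of equations can be reduced to a quadratic in any one eigenvalue. It can be shown that the only two possible solutions are then

$$
\begin{aligned}
\lambda_1 &= \frac{\|\mathbf{x}\|^2 - \|\mathbf{y}\|^2}{2}\left(1 + \sqrt{1 - \frac{4(|\mathbf{x}^H\mathbf{y}|^2 - \|\mathbf{x}\|^2\|\mathbf{y}\|^2)}{(\|\mathbf{x}\|^2 - \|\mathbf{y}\|^2)^2}}\right) \\
\lambda_2 &= \frac{\|\mathbf{x}\|^2 - \|\mathbf{y}\|^2}{2}\left(1 - \sqrt{1 - \frac{4(|\mathbf{x}^H\mathbf{y}|^2 - \|\mathbf{x}\|^2\|\mathbf{y}\|^2)}{(\|\mathbf{x}\|^2 - \|\mathbf{y}\|^2)^2}}\right)
\end{aligned}
\tag{49}
$$

Since $\lambda_i$, $i = 1,2$ are purely real we have $1 - \frac{4(|\mathbf{x}^H\mathbf{y}|^2 - \|\mathbf{x}\|^2\|\mathbf{y}\|^2)}{(\|\mathbf{x}\|^2 - \|\mathbf{y}\|^2)^2} > 0$, and from the Cauchy-Schwarz inequality we also have that $|\mathbf{x}^H\mathbf{y}|^2 - \|\mathbf{x}\|^2\|\mathbf{y}\|^2 \le 0$. Using these two facts, consider two specific cases, both of which are shown easily from elementary algebra,

$$
\begin{cases}
\lambda_1 > 0,\ \lambda_2 < 0, & \text{if } \|\mathbf{x}\|^2 - \|\mathbf{y}\|^2 > 0 \\
\lambda_1 < 0,\ \lambda_2 > 0, & \text{if } \|\mathbf{x}\|^2 - \|\mathbf{y}\|^2 < 0
\end{cases}
\tag{50}
$$

When $\|\mathbf{x}\|^2 - \|\mathbf{y}\|^2 = 0$, it is easily seen that $\lambda_1 = \sqrt{\|\mathbf{x}\|^2\|\mathbf{y}\|^2 - |\mathbf{x}^H\mathbf{y}|^2} > 0$, $\lambda_2 = -\lambda_1 < 0$. We also note immediately from (49) that when $\mathbf{y} = \beta\mathbf{x}$, $\lambda_1 = (1 - |\beta|^2)\|\mathbf{x}\|^2$, $\lambda_2 = 0$. This completes the proof. □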
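The eigenvalue structure claimed in Lem. 6 can be checked against a direct eigen-decomposition. The following sketch uses random complex vectors (an assumption for illustration) and compares the two non-zero eigenvalues of $\mathbf{x}\mathbf{x}^H - \mathbf{y}\mathbf{y}^H$ with the roots obtained from the trace relations, i.e. $\lambda_1 + \lambda_2 = \|\mathbf{x}\|^2 - \|\mathbf{y}\|^2$ and $\lambda_1\lambda_2 = |\mathbf{x}^H\mathbf{y}|^2 - \|\mathbf{x}\|^2\|\mathbf{y}\|^2$.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 5  # illustrative size


def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


x, y = crandn(N), crandn(N)
A = np.outer(x, x.conj()) - np.outer(y, y.conj())

a = np.vdot(x, x).real            # ||x||^2
b = np.vdot(y, y).real            # ||y||^2
c = abs(np.vdot(x, y)) ** 2       # |x^H y|^2

# Roots of lambda^2 - (a - b) lambda + (c - a b) = 0, written as in (49).
root = np.sqrt(1 - 4 * (c - a * b) / (a - b) ** 2)
lam1 = 0.5 * (a - b) * (1 + root)
lam2 = 0.5 * (a - b) * (1 - root)

eig = np.linalg.eigvalsh(A)
nonzero = eig[np.abs(eig) > 1e-9]

# Case (c): y = beta x leaves a single non-zero eigenvalue (1 - |beta|^2) ||x||^2.
beta = 0.5 + 0.25j
A_c = np.outer(x, x.conj()) - np.outer(beta * x, (beta * x).conj())
eig_c = np.linalg.eigvalsh(A_c)
nonzero_c = eig_c[np.abs(eig_c) > 1e-9]
```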

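It is readily shown that $g(\mathbf{w}_k,\mathbf{s}_k) - g(\mathbf{w}_{k+1},\mathbf{s}_k) = \mathrm{Tr}\{\mathbf{R}_u(\mathbf{s}_k)(\mathbf{w}_k\mathbf{w}_k^H - \mathbf{w}_{k+1}\mathbf{w}_{k+1}^H)\}$. Therefore, from Lem. 5 and Lem. 6, we have

$$
\begin{aligned}
&\lambda_{\max}(\mathbf{R}_u(\mathbf{s}_k))\lambda_{-}(\mathbf{w}_k\mathbf{w}_k^H - \mathbf{w}_{k+1}\mathbf{w}_{k+1}^H) + \lambda_{\min}(\mathbf{R}_u(\mathbf{s}_k))\lambda_{+}(\mathbf{w}_k\mathbf{w}_k^H - \mathbf{w}_{k+1}\mathbf{w}_{k+1}^H) \\
&\quad\le g(\mathbf{w}_k,\mathbf{s}_k) - g(\mathbf{w}_{k+1},\mathbf{s}_k) \\
&\quad\le \lambda_{\max}(\mathbf{R}_u(\mathbf{s}_k))\lambda_{+}(\mathbf{w}_k\mathbf{w}_k^H - \mathbf{w}_{k+1}\mathbf{w}_{k+1}^H) + \lambda_{\min}(\mathbf{R}_u(\mathbf{s}_k))\lambda_{-}(\mathbf{w}_k\mathbf{w}_k^H - \mathbf{w}_{k+1}\mathbf{w}_{k+1}^H)
\end{aligned}
\tag{51}
$$

where $\lambda_{+}(\cdot)$ and $\lambda_{-}(\cdot)$ denote the positive and the negative non-zero eigenvalue from Lem. 6. It is not immediately evident from the analysis which set of bounds in (47), (48), (51) is tight, hence combining them we have

$$
\begin{aligned}
&\max\{g_{lb}^{1}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1}),\ g_{lb}^{2}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1}),\ g_{lb}^{3}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1})\} \\
&\quad\le g(\mathbf{w}_k,\mathbf{s}_k) - g(\mathbf{w}_{k+1},\mathbf{s}_k) \le \\
&\min\{g_{ub}^{1}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1}),\ g_{ub}^{2}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1}),\ g_{ub}^{3}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1})\}
\end{aligned}
$$

where $g_{lb}^{i}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1})$, $g_{ub}^{i}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1})$, $i = 1,2,3$ are the lower and upper bounds as given in (47)-(48), (51), for $i = 1,2,3$, respectively.

Similar upper and lower bounds can be readily derived for the other corresponding terms, $g(\mathbf{w}_{k+1},\mathbf{s}_k) - g(\mathbf{w}_{k+1},\mathbf{s}_{k+1})$, using the analysis presented thus far, and this is not the focus now. Let us however denote these corresponding lower and upper bounds to be $h_{lb}^{i}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1})$, $h_{ub}^{i}(\mathbf{R}_u(\mathbf{s}_k),\mathbf{w}_k,\mathbf{w}_{k+1})$, $i = 1,2,3$.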


C.  Constrained  proximal  alternating  minimization 

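The proximal version of the constrained alternating minimization is iterative, and for the filter design step, optimizes at the $k$-th iteration,

$$
\begin{aligned}
\min_{\mathbf{w}}\;& \mathbf{w}^H\mathbf{R}_u(\mathbf{s}_{k-1})\mathbf{w} + \frac{\alpha_{k-1}}{2}\|\mathbf{w} - \mathbf{w}_{k-1}\|^2 \\
\text{s. t.}\;& \mathbf{w}^H\big(\mathbf{v}(f_d)\otimes\mathbf{s}_{k-1}\otimes\mathbf{a}(\theta_t,\phi_t)\big) = \kappa
\end{aligned}
\tag{52}
$$

where $\alpha_{k-1} \in \mathbb{R}^+$ can be seen as a weight attached to the regularizer / penalizer $\|\mathbf{w} - \mathbf{w}_{k-1}\|^2$. This parameter can be interpreted as follows: if it is large, any departure from $\mathbf{w}_{k-1}$ is penalized heavily, which encourages the optimizer to look for viable solutions in the vicinity of $\mathbf{w}_{k-1}$. However, if it is small, the penalty is weak and the optimizer is free to move away from the immediate vicinity of $\mathbf{w}_{k-1}$.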
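The effect of the proximal weight in (52) can be illustrated with a small equality-constrained quadratic. In this sketch `R` is a random positive definite stand-in for $\mathbf{R}_u(\mathbf{s}_{k-1})$, `a` stands in for the steering vector $\mathbf{v}(f_d)\otimes\mathbf{s}_{k-1}\otimes\mathbf{a}(\theta_t,\phi_t)$, and the multiplier `mu` is obtained from the stationarity and constraint equations of this toy problem; the function name `prox_filter` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)
K = 6          # assumed filter length
kappa = 1.0    # assumed constraint level


def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


C = crandn(K, K)
R = C @ C.conj().T + np.eye(K)   # PD stand-in for R_u(s_{k-1})
a = crandn(K)                    # stand-in for the constraint vector
w_prev = crandn(K)               # previous iterate w_{k-1}


def prox_filter(alpha):
    """Solve min w^H R w + (alpha/2)||w - w_prev||^2  s.t.  w^H a = kappa.

    Stationarity gives (R + alpha/2 I) w = (alpha/2) w_prev - mu a; the complex
    multiplier mu is fixed by the constraint a^H w = kappa.
    """
    A = R + 0.5 * alpha * np.eye(K)
    Ainv_w = np.linalg.solve(A, w_prev)
    Ainv_a = np.linalg.solve(A, a)
    mu = (0.5 * alpha * (a.conj() @ Ainv_w) - kappa) / (a.conj() @ Ainv_a)
    return 0.5 * alpha * Ainv_w - mu * Ainv_a


w_small = prox_filter(0.01)      # weak penalty
w_large = prox_filter(100.0)     # strong penalty
dist_small = np.linalg.norm(w_small - w_prev)
dist_large = np.linalg.norm(w_large - w_prev)
```

The distance to `w_prev` shrinks as `alpha` grows, so larger weights keep the update closer to the previous iterate while the constraint remains satisfied.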

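In a similar spirit, the proximal version of the constrained alternating minimization for the waveform design step at the $k$-th iteration optimizes,

$$
\begin{aligned}
\min_{\mathbf{s}}\;& \mathbf{w}_k^H\mathbf{R}_u(\mathbf{s})\mathbf{w}_k + \frac{\beta_{k-1}}{2}\|\mathbf{s} - \mathbf{s}_{k-1}\|^2 \\
\text{s. t.}\;& \mathbf{w}_k^H\big(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)\big) = \kappa \\
& \mathbf{s}^H\mathbf{s} \le P_o
\end{aligned}
\tag{53}
$$

where $\beta_{k-1} \in \mathbb{R}^+$ is the weight attached to the regularizer $\|\mathbf{s} - \mathbf{s}_{k-1}\|^2$. Bounds on $\alpha_{k-1}$, $\beta_{k-1}$ relating them to the Lipschitz constants are deferred to forthcoming analysis. A graphical example comparing the constrained alternating minimization and the proximal constrained alternating minimization is shown in Fig. 5.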

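Remark 6. The objective functions in (52), (53) are still individually convex in $\mathbf{w}$, $\mathbf{s}$, respectively. The regularizer terms $\|\mathbf{w} - \mathbf{w}_{k-1}\|^2$ and $\|\mathbf{s} - \mathbf{s}_{k-1}\|^2$ are strongly convex, with $\nabla_{\mathbf{w}}^2(\|\mathbf{w} - \mathbf{w}_{k-1}\|^2) = \mathbf{I} \succ \mathbf{0}$, $\nabla_{\mathbf{s}}^2(\|\mathbf{s} - \mathbf{s}_{k-1}\|^2) = \mathbf{I} \succ \mathbf{0}$, and therefore do not alter the individual convexity of $\mathbf{w}^H\mathbf{R}_u(\mathbf{s}_{k-1})\mathbf{w}$ and $\mathbf{w}_k^H\mathbf{R}_u(\mathbf{s})\mathbf{w}_k$, w.r.t. $\mathbf{w}$, $\mathbf{s}$, respectively.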

Fig. 5: Constrained alternating minimization (left) and proximal constrained alternating minimization (right). Iso-level contours (each point on a curve has identical function values) and the constraint set in the background are shown. Outer iso-curves assume higher function values than the inner iso-curves. On the right, and for particular $\alpha_k$, $\beta_k$, the spheres (dashed blue, dashed red) are two of the several spheres of influence of the regularizer. Outer spheres penalize more than the inner.

The solutions to (52), (53) can be cast in terms of the proximal operator as

$$
\begin{aligned}
\mathbf{w}_k &= \mathrm{prox}_{(\alpha_{k-1},\mathbf{w})}\big(g(\mathbf{w},\mathbf{s}_{k-1});\mathbf{w}_{k-1}\big) \\
&\quad\text{s. t. } \mathbf{w}^H\big(\mathbf{v}(f_d)\otimes\mathbf{s}_{k-1}\otimes\mathbf{a}(\theta_t,\phi_t)\big) = \kappa
\end{aligned}
\tag{54}
$$

$$
\begin{aligned}
\mathbf{s}_k &= \mathrm{prox}_{(\beta_{k-1},\mathbf{s})}\big(g(\mathbf{w}_k,\mathbf{s});\mathbf{s}_{k-1}\big) \\
&\quad\text{s. t. } \mathbf{w}_k^H\big(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)\big) = \kappa,\quad \mathbf{s}^H\mathbf{s} \le P_o
\end{aligned}
\tag{55}
$$

where, for a general $f(\mathbf{x}) : \mathbb{C}^N \to \mathbb{R}$, the proximal operator is defined as

$$
\mathrm{prox}_{(\alpha,\mathbf{x})}(f(\mathbf{x});\mathbf{y}) := \arg\min_{\mathbf{x}}\ f(\mathbf{x}) + \frac{\alpha}{2}\|\mathbf{x} - \mathbf{y}\|^2. \tag{56}
$$

The proximal operator has a rich history in the literature, and well documented properties, see for example [20]-[22], [24]. A useful and interesting fact of this operator is that iff $\mathbf{x}_o$ minimizes $f(\mathbf{x})$ then $\mathbf{x}_o = \mathrm{prox}_{(\alpha,\mathbf{x})}(f(\mathbf{x});\mathbf{x}_o)$; a proof is seen in [24].
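The fixed-point property of the proximal operator can be seen on a simple unconstrained quadratic. The sketch below assumes $f(\mathbf{x}) = \mathbf{x}^H\mathbf{B}\mathbf{x} - 2\mathrm{Re}\{\mathbf{b}^H\mathbf{x}\}$ with a random positive definite $\mathbf{B}$, for which the proximal point has the closed form $(\mathbf{B} + \frac{\alpha}{2}\mathbf{I})^{-1}(\mathbf{b} + \frac{\alpha}{2}\mathbf{y})$; the test function and all numerical values are illustrative choices, not taken from the report.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 4          # illustrative size
alpha = 3.0    # assumed proximal weight


def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


C = crandn(N, N)
Bm = C @ C.conj().T + np.eye(N)   # Hermitian positive definite
b = crandn(N)


def f(x):
    """Test function f(x) = x^H B x - 2 Re{b^H x}."""
    return (x.conj() @ Bm @ x).real - 2 * (b.conj() @ x).real


def prox(y, alpha):
    """argmin_x f(x) + (alpha/2)||x - y||^2 for the quadratic f above."""
    return np.linalg.solve(Bm + 0.5 * alpha * np.eye(N), b + 0.5 * alpha * y)


def prox_objective(x, y, alpha):
    return f(x) + 0.5 * alpha * np.linalg.norm(x - y) ** 2


x_o = np.linalg.solve(Bm, b)      # unconstrained minimizer of f
fixed = prox(x_o, alpha)          # should return x_o itself

y = crandn(N)                     # arbitrary anchor point
p = prox(y, alpha)
```

For an anchor `y` other than the minimizer, the proximal point `p` trades off decreasing `f` against staying near `y`.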

Trust region interpretation. The objective now is to relate the unconstrained proximal minimization as in (56) to a well known technique in numerical optimization. A generalized trust region subproblem can be formulated for $f(\mathbf{x}) : \mathbb{C}^N \to \mathbb{R}$ [62]

$$
\begin{aligned}
\min_{\mathbf{x}}\;& f(\mathbf{x}) \\
\text{s. t.}\;& \|\mathbf{U}\mathbf{x} - \mathbf{v}\|^2 \le \delta
\end{aligned}
\tag{57}
$$

where $\mathbf{U}$, $\mathbf{v}$ are a general nonsingular matrix and a vector, both characterizing the trust region. The positive scalar $\delta$ may be interpreted as a parameter which specifies the extent of the trust region. For $\mathbf{U} = \mathbf{I}$ and $\mathbf{v} = \mathbf{y}$, the proximal minimization as in (56) and the trust region problem in (57) are equivalent for specific values of $\alpha$ and $\delta$. In particular every solution of (56) is a solution to (57) for a particular $\delta$. In the same spirit, every solution to (57) is an unconstrained minimizer of $f(\cdot)$ or a solution to (56) for a particular $\alpha$, see also [22], [24].

The proximal optimization problems (52), (53) can be cast as equivalent constrained trust region subproblems, where for the $k$-th iteration, the trust region is characterized by the previous iteration, $\mathbf{w}_{k-1}$, $\mathbf{s}_{k-1}$, respectively.
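Closed form: A closed form solution to (52) is readily derived, expressed as in (58)

$$
\begin{aligned}
\mathbf{w}_k &= \Big(\mathbf{R}_u(\mathbf{s}_{k-1}) + \frac{\alpha_{k-1}}{2}\mathbf{I}\Big)^{-1}\Big(\frac{\alpha_{k-1}}{2}\mathbf{w}_{k-1} - \frac{\gamma_4^{*}}{2}\big(\mathbf{v}(f_d)\otimes\mathbf{s}_{k-1}\otimes\mathbf{a}(\theta_t,\phi_t)\big)\Big) \\
\gamma_4 &= \frac{\alpha_{k-1}\mathbf{w}_{k-1}^H\big(\mathbf{R}_u(\mathbf{s}_{k-1}) + \frac{\alpha_{k-1}}{2}\mathbf{I}\big)^{-1}\big(\mathbf{v}(f_d)\otimes\mathbf{s}_{k-1}\otimes\mathbf{a}(\theta_t,\phi_t)\big) - 2\kappa}{\big(\mathbf{v}(f_d)\otimes\mathbf{s}_{k-1}\otimes\mathbf{a}(\theta_t,\phi_t)\big)^H\big(\mathbf{R}_u(\mathbf{s}_{k-1}) + \frac{\alpha_{k-1}}{2}\mathbf{I}\big)^{-1}\big(\mathbf{v}(f_d)\otimes\mathbf{s}_{k-1}\otimes\mathbf{a}(\theta_t,\phi_t)\big)}
\end{aligned}
\tag{58}
$$

where $\gamma_4$ is the Lagrange parameter associated with (52). The solution to (53) is also in closed form and the procedure to obtain it is similar to that used in deriving (43). Assuming that the Lagrange parameters for (53) are $\gamma_5 = \gamma_{5r} + j\gamma_{5i}$, $\gamma_6 \in \mathbb{R}^+$, the solution is expressed in (59),

$$
\mathbf{s}_k = \Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\Big)^{-1}\Big(\frac{\beta_{k-1}}{2}\mathbf{s}_{k-1} - \frac{\gamma_5}{2}\mathbf{G}^H\mathbf{w}_k\Big) \tag{59}
$$

where,

$$
\begin{aligned}
\gamma_{5r} &= \frac{\beta_{k-1}\mathrm{Re}\Big\{\mathbf{w}_k^H\mathbf{G}\big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\big)^{-1}\mathbf{s}_{k-1}\Big\} - 2\kappa}{\mathbf{w}_k^H\mathbf{G}\big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\big)^{-1}\mathbf{G}^H\mathbf{w}_k} \\
\gamma_{5i} &= \frac{\beta_{k-1}\mathrm{Im}\Big\{\mathbf{w}_k^H\mathbf{G}\big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\big)^{-1}\mathbf{s}_{k-1}\Big\}}{\mathbf{w}_k^H\mathbf{G}\big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\big)^{-1}\mathbf{G}^H\mathbf{w}_k}.
\end{aligned}
$$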

The Lagrange parameter $\gamma_6$ is obtained by solving the following

$$
\gamma_6\,r(\gamma_6) = 0,\quad \gamma_6 \ge 0 \tag{60}
$$

obtained from the complementary slackness constraint on the Lagrange dual and where,

i’(7e)  =  ( Po  -  — fw^G(  Zq(wfc)  +  ^^-1  +  76l)_1GHwfc"j 

'  q= 1  ' 

~‘2(bi<cfy6  +  +  yy^  +  Tel)  1GHwk 

Q  n 

~{b2i  +  {br  -  K)2)wfG(^Zq(wfc)  +  ^I  +  76l)'2G%. 

9=1 
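Where we also define

$$
\begin{aligned}
a_k &= \mathbf{s}_{k-1}^H\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\Big)^{-1}\mathbf{s}_{k-1} \\
b_r &= \frac{\beta_{k-1}}{2}\mathrm{Re}\Big\{\mathbf{w}_k^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\Big)^{-1}\mathbf{s}_{k-1}\Big\} \\
b_i &= \frac{\beta_{k-1}}{2}\mathrm{Im}\Big\{\mathbf{w}_k^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\Big)^{-1}\mathbf{s}_{k-1}\Big\}
\end{aligned}
$$

Further, since the derivative, $\mathrm{Re}\{\cdot\}$, $\mathrm{Im}\{\cdot\}$ are all linear we also have

$$
\begin{aligned}
\frac{\partial b_r}{\partial\gamma_6} &= -\frac{\beta_{k-1}}{2}\mathrm{Re}\Big\{\mathbf{w}_k^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\Big)^{-2}\mathbf{s}_{k-1}\Big\} \\
\frac{\partial b_i}{\partial\gamma_6} &= -\frac{\beta_{k-1}}{2}\mathrm{Im}\Big\{\mathbf{w}_k^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I} + \gamma_6\mathbf{I}\Big)^{-2}\mathbf{s}_{k-1}\Big\}.
\end{aligned}
$$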

Remark 7. In general $r(\gamma_6)$ is not monotone and there exist one or more zero crossings excluding $\gamma_6 = \infty$. However in our extensive numerical simulations, and assuming $P_o \gg \kappa^2$, $\gamma_6 = 0$ solves (60).

It is readily seen that $\lim_{\gamma_6\to 0}r(\gamma_6) = r(0) \ne 0$, $\lim_{\gamma_6\to\infty}r(\gamma_6) = 0$, $\lim_{\gamma_6\to\infty}\gamma_6r(\gamma_6) = 0$. Nevertheless, unlike Prop. 3, Rem. 7 is not straightforward to demonstrate analytically; it can however be shown numerically. See Section IV for some demonstrative examples not specific to the radar problem.

The value of $\gamma_6 = 0$ is substituted in (59) to obtain the final waveform solution $\mathbf{s}_k(\cdot)$.

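Remark 8. (Strong duality) The primal problem (53) and its associated dual have zero duality gap. This is straightforward but tedious to show. We therefore only state the optimal value attained by both the primal and the dual, given below,

$$\mathbf{w}_k^H(\mathbf{R}_i + \mathbf{R}_n)\mathbf{w}_k + \mathbf{s}_k^{\star H}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I}\Big)\mathbf{s}_k^{\star} + \frac{\beta_{k-1}}{2}\mathbf{s}_{k-1}^H\mathbf{s}_{k-1} - \beta_{k-1}\mathrm{Re}\{\mathbf{s}_k^{\star H}\mathbf{s}_{k-1}\} \tag{61}$$

where, using (59) and Prop. 7,

$$\mathbf{s}_k^{\star} = \Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I}\Big)^{-1}\Big(\frac{\beta_{k-1}}{2}\mathbf{s}_{k-1} - \frac{\gamma_5}{2}\mathbf{G}^H\mathbf{w}_k\Big)$$

$$\gamma_5 = \frac{\beta_{k-1}\,\mathbf{w}_k^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I}\Big)^{-1}\mathbf{s}_{k-1} - 2\kappa}{\mathbf{w}_k^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}_k) + \frac{\beta_{k-1}}{2}\mathbf{I}\Big)^{-1}\mathbf{G}^H\mathbf{w}_k}.$$

This is not surprising since it is similar to Rem. 3. However, in this case the condition on the existence of the matrix inverse is irrelevant, since the inverse in (59) always exists. Hence Slater's condition now reduces to a simple constraint qualification (the power constraint), which must be satisfied as in Rem. 3.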
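As a minimal numerical sketch of the closed-form update above with $\gamma_6 = 0$, the following Python/NumPy snippet evaluates $\gamma_5$ and $\mathbf{s}_k^{\star}$ and checks that the Capon constraint holds. The matrix `Z` (standing in for $\sum_q \mathbf{Z}_q(\mathbf{w}_k)$), the vector `u` (standing in for $\mathbf{G}^H\mathbf{w}_k$), the function name and all numerical values are hypothetical random stand-ins, not quantities from the radar model.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6

# Hypothetical stand-ins (random data, not the radar scenario):
# Z plays the role of sum_q Z_q(w_k), u the role of G^H w_k.
A0 = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Z = A0 @ A0.conj().T                       # Hermitian positive semidefinite
u = rng.standard_normal(N) + 1j * rng.standard_normal(N)
s_prev = rng.standard_normal(N) + 1j * rng.standard_normal(N)
beta, kappa = 0.5, 2.0


def proximal_waveform_update(Z, u, s_prev, beta, kappa):
    """Closed-form update assuming gamma_6 = 0 (power constraint inactive)."""
    A = Z + 0.5 * beta * np.eye(len(u))    # diagonally loaded matrix
    Ainv_s = np.linalg.solve(A, s_prev)
    Ainv_u = np.linalg.solve(A, u)
    denom = (u.conj() @ Ainv_u).real       # u^H A^{-1} u is real and positive
    # gamma_5 is chosen so that the Capon constraint u^H s = kappa holds
    gamma5 = (beta * (u.conj() @ Ainv_s) - 2.0 * kappa) / denom
    s = 0.5 * beta * Ainv_s - 0.5 * gamma5 * Ainv_u
    return s, gamma5


def proximal_objective(s):
    """s^H Z s + (beta/2) ||s - s_prev||^2, the regularized waveform cost."""
    return (s.conj() @ Z @ s).real + 0.5 * beta * np.linalg.norm(s - s_prev) ** 2


s_new, gamma5 = proximal_waveform_update(Z, u, s_prev, beta, kappa)
constraint_residual = abs(u.conj() @ s_new - kappa)
```

Any perturbation of `s_new` that keeps $\mathbf{u}^H\mathbf{s} = \kappa$ should not lower `proximal_objective`, which gives a quick sanity check of the update.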

Interpretation with specific ranges of $\alpha_{k-1}$, $\beta_{k-1}$ and their relation to the Lipschitz constants. Some definitions and lemmas are useful for the discussion that follows and are stated below.

Definition 3. (Lipschitz continuous gradient) A function $f(\mathbf{x}): \mathbb{R}^N \to \mathbb{R}$ has a Lipschitz continuous gradient with constant $L$ (trivially real and positive) when $\|\nabla_{\mathbf{x}} f(\mathbf{x}) - \nabla_{\mathbf{y}} f(\mathbf{y})\| \leq L\|\mathbf{x} - \mathbf{y}\|$ for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^N$.

Note: (upper bound on Hessian) If $f(\mathbf{x})$ has a Lipschitz continuous gradient with constant $L$, then using Taylor's theorem it can be proved that $\nabla^2_{\mathbf{x}} f(\mathbf{x}) \preceq L\mathbf{I}$.

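Remark 9. The Lipschitz constant for $f(\mathbf{x}) = \mathbf{x}^T\mathbf{B}\mathbf{x}$ is the maximum eigenvalue of $\mathbf{B}$, i.e. $\lambda_{\max}(\mathbf{B})$, where $\mathbf{B} \in \mathbb{R}^{N\times N}$, $\mathbf{x} \in \mathbb{R}^N$.

This is readily seen since $\nabla_{\mathbf{x}}\mathbf{x}^T\mathbf{B}\mathbf{x} = \mathbf{B}\mathbf{x}$. Further, the induced (by an arbitrary $\mathbf{z} \in \mathbb{R}^N$) spectral norm (notation: $|||\cdot|||$) is defined as

$$|||\mathbf{B}||| := \sup\Big\{\frac{\|\mathbf{B}\mathbf{z}\|}{\|\mathbf{z}\|} : \mathbf{z} \in \mathbb{R}^N,\ \mathbf{z} \neq \mathbf{0}\Big\}, \qquad \|\mathbf{B}\mathbf{z}\| = \sqrt{\mathbf{z}^T\mathbf{B}^T\mathbf{B}\mathbf{z}},$$

but we know from Lem. 4 that $\mathbf{z}^T\mathbf{B}\mathbf{z} \leq \lambda_{\max}(\mathbf{B})\|\mathbf{z}\|^2$ and that the eigenvalues of $\mathbf{B}$ and $\mathbf{B}^T$ are identical. This further implies that $\mathbf{z}^T\mathbf{B}^T\mathbf{B}\mathbf{z} \leq \lambda^2_{\max}(\mathbf{B})\|\mathbf{z}\|^2$. Therefore, from Definition 3, it is readily seen that the Lipschitz constant is the maximum eigenvalue of $\mathbf{B}$.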

Lemma 7. (Descent lemma) If $f(\mathbf{x}): \mathbb{R}^N \to \mathbb{R}$ is continuously differentiable and has a Lipschitz continuous gradient described by constant $L$, then $f(\mathbf{x}) \leq f(\mathbf{y}) + \nabla_{\mathbf{y}} f(\mathbf{y})^T(\mathbf{x} - \mathbf{y}) + \frac{L}{2}\|\mathbf{x} - \mathbf{y}\|^2$.


Proof. See [14, Prop. A.24] and also [17, Lem. 2.2], relevant in general for the BCD. □
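The descent lemma can be checked numerically on a quadratic. The sketch below assumes the scaling $f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T\mathbf{B}\mathbf{x}$ with a symmetric positive semidefinite $\mathbf{B}$, so that the gradient is exactly $\mathbf{B}\mathbf{x}$ and $L = \lambda_{\max}(\mathbf{B})$; the test matrix and the helper names are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5
M = rng.standard_normal((N, N))
B = M @ M.T                        # symmetric positive semidefinite test matrix
L = np.linalg.eigvalsh(B)[-1]      # largest eigenvalue, used as Lipschitz constant


def f(x):
    return 0.5 * x @ B @ x         # with this scaling the gradient is B x


def grad_f(x):
    return B @ x


def descent_gap(x, y):
    """Quadratic upper bound at y minus f(x); non-negative if the lemma holds."""
    bound = f(y) + grad_f(y) @ (x - y) + 0.5 * L * np.sum((x - y) ** 2)
    return bound - f(x)


pairs = [(rng.standard_normal(N), rng.standard_normal(N)) for _ in range(1000)]
min_gap = min(descent_gap(x, y) for x, y in pairs)
# Lipschitz ratio ||grad f(x) - grad f(y)|| / ||x - y||, which should not exceed L
max_ratio = max(np.linalg.norm(grad_f(x) - grad_f(y)) / np.linalg.norm(x - y)
                for x, y in pairs)
```

For this quadratic the gap equals $\frac{1}{2}(\mathbf{x}-\mathbf{y})^T(L\mathbf{I}-\mathbf{B})(\mathbf{x}-\mathbf{y})$, which is why it cannot be negative.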

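Consider an arbitrary $g(\mathbf{x}) := \mathbf{x}^H\mathbf{B}\mathbf{x}$, with $\mathbf{B} = \mathbf{B}^H$, $\mathbf{x} \in \mathbb{C}^N$. Since $g(\mathbf{x}): \mathbb{C}^N \to \mathbb{R}$, a real equivalent of $g(\mathbf{x})$ can be defined as $\tilde{g}(\tilde{\mathbf{x}}) := \tilde{\mathbf{x}}^T\tilde{\mathbf{B}}\tilde{\mathbf{x}}$ where

$$\tilde{\mathbf{B}} := \begin{bmatrix}\mathrm{Re}\{\mathbf{B}\} & -\mathrm{Im}\{\mathbf{B}\}\\ \mathrm{Im}\{\mathbf{B}\} & \mathrm{Re}\{\mathbf{B}\}\end{bmatrix} \in \mathbb{R}^{2N\times 2N}, \qquad \tilde{\mathbf{x}} = [\mathrm{Re}\{\mathbf{x}\}^T, \mathrm{Im}\{\mathbf{x}\}^T]^T \in \mathbb{R}^{2N}.$$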


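Lemma 8. The matrix $\tilde{\mathbf{B}} := \begin{bmatrix}\mathrm{Re}\{\mathbf{B}\} & -\mathrm{Im}\{\mathbf{B}\}\\ \mathrm{Im}\{\mathbf{B}\} & \mathrm{Re}\{\mathbf{B}\}\end{bmatrix} \in \mathbb{R}^{2N\times 2N}$ and $\begin{bmatrix}\mathbf{B} & \mathbf{0}\\ \mathbf{0} & \mathbf{B}^*\end{bmatrix} \in \mathbb{C}^{2N\times 2N}$ have identical eigenvalues $\lambda_i$, $i = 1, 2, \ldots, 2N$. Moreover, if $\mathbf{B}$ is Hermitian, then $\lambda_i \in \mathbb{R}$, $i = 1, 2, \ldots, 2N$, and these are the eigenvalues of $\mathbf{B} \in \mathbb{C}^{N\times N}$, each with twice its multiplicity.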


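Proof. Owing to the complex to real-real isomorphism, it can be shown after algebraic manipulations that

$$\begin{bmatrix}\mathbf{B} & \mathbf{0}\\ \mathbf{0} & \mathbf{B}^*\end{bmatrix} = \mathbf{P}^H\tilde{\mathbf{B}}\mathbf{P}, \qquad \mathbf{P} = \frac{1}{\sqrt{2}}\begin{bmatrix} j\mathbf{I} & \mathbf{I}\\ \mathbf{I} & j\mathbf{I}\end{bmatrix}, \qquad \mathbf{P}^H = \mathbf{P}^{-1}. \tag{62}$$

That is, (62) indicates that $\tilde{\mathbf{B}}$ and $\begin{bmatrix}\mathbf{B} & \mathbf{0}\\ \mathbf{0} & \mathbf{B}^*\end{bmatrix}$ are unitarily equivalent. Therefore they share the same eigenvalues. Furthermore, if $\mathbf{B}$ is Hermitian its eigenvalues are purely real, and hence, trivially, the eigenvalues of $\mathbf{B}$ and $\mathbf{B}^*$ are identical, and their eigenvectors are complex conjugates of one another. Hence the block diagonal matrix has the same eigenvalues as $\mathbf{B}$ but with multiplicity two. □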
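The unitary equivalence in Lemma 8 is easy to verify numerically. The following sketch builds the real equivalent of a random Hermitian matrix, applies the unitary matrix $\mathbf{P} = \frac{1}{\sqrt{2}}\begin{bmatrix} j\mathbf{I} & \mathbf{I}\\ \mathbf{I} & j\mathbf{I}\end{bmatrix}$, and compares eigenvalues; the random test matrix and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
B = M + M.conj().T                          # Hermitian test matrix

# Real equivalent of B, acting on the stacked vector [Re{x}; Im{x}]
B_real = np.block([[B.real, -B.imag],
                   [B.imag,  B.real]])

# Unitary P such that P^H B_real P = blkdiag(B, B*)
I = np.eye(N)
P = np.block([[1j * I, I],
              [I, 1j * I]]) / np.sqrt(2)
block_diag = P.conj().T @ B_real @ P

eig_B = np.linalg.eigvalsh(B)               # N real eigenvalues, ascending
eig_real = np.linalg.eigvalsh(B_real)       # B_real is symmetric when B is Hermitian

# The quadratic forms agree: x^H B x equals x_real^T B_real x_real
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
x_real = np.concatenate([x.real, x.imag])
form_complex = (x.conj() @ B @ x).real
form_real = x_real @ B_real @ x_real
```

Each eigenvalue of `B` appears twice in `eig_real`, which is what allows the Lipschitz constants of the real equivalents to be read off from the complex matrices.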


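Consider the objectives in (52), (53). Define $\tilde{g}(\tilde{\mathbf{w}}, \tilde{\mathbf{s}}_{k-1})$, $\tilde{g}(\tilde{\mathbf{w}}_{k-1}, \tilde{\mathbf{s}}_{k-1})$ as the real equivalents of $g(\mathbf{w}, \mathbf{s}_{k-1})$, $g(\mathbf{w}_{k-1}, \mathbf{s}_{k-1})$, respectively, for the filter design objective in (52). In addition, denote by $L_{1,k-1}$ the Lipschitz constant associated with $\tilde{g}(\tilde{\mathbf{w}}_{k-1}, \tilde{\mathbf{s}}_{k-1})$. Similarly, using the same notation for the waveform design objective in (53), consider the real equivalents $\tilde{g}(\tilde{\mathbf{s}}, \tilde{\mathbf{w}}_k)$, $\tilde{g}(\tilde{\mathbf{s}}_{k-1}, \tilde{\mathbf{w}}_k)$ and the Lipschitz constant denoted by $L_{2,k-1}$. Then the following inequalities can be shown.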


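$$\tilde{g}(\tilde{\mathbf{w}}) + \frac{L_{1,k-1}}{2}\|\tilde{\mathbf{w}}_{k-1} - \tilde{\mathbf{w}}\|^2 \geq \tilde{g}(\tilde{\mathbf{w}}_{k-1}) + \nabla\tilde{g}(\tilde{\mathbf{w}}_{k-1})^T(\tilde{\mathbf{w}} - \tilde{\mathbf{w}}_{k-1}) + \frac{L_{1,k-1}}{2}\|\tilde{\mathbf{w}}_{k-1} - \tilde{\mathbf{w}}\|^2 \geq \tilde{g}(\tilde{\mathbf{w}}) \tag{63}$$

$$\tilde{g}(\tilde{\mathbf{s}}) + \frac{L_{2,k-1}}{2}\|\tilde{\mathbf{s}}_{k-1} - \tilde{\mathbf{s}}\|^2 \geq \tilde{g}(\tilde{\mathbf{s}}_{k-1}) + \nabla\tilde{g}(\tilde{\mathbf{s}}_{k-1})^T(\tilde{\mathbf{s}} - \tilde{\mathbf{s}}_{k-1}) + \frac{L_{2,k-1}}{2}\|\tilde{\mathbf{s}}_{k-1} - \tilde{\mathbf{s}}\|^2 \geq \tilde{g}(\tilde{\mathbf{s}}) \tag{64}$$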


where the known $\mathbf{s}_{k-1}$ in (63) and the known $\mathbf{w}_k$ in (64) are treated as constants and are therefore suppressed in the notation for brevity. We further note that (63), (64) are tight, i.e. for $\mathbf{w}_k = \mathbf{w}_{k-1}$, $\mathbf{s}_k = \mathbf{s}_{k-1}$ the inequalities hold with equality. The Lipschitz constants $L_{1,k-1}$, $L_{2,k-1}$ are readily derived using Lem. 8.


Remark 10. It is readily seen that if $\alpha_{k-1} \geq L_{1,k-1}$ and $\beta_{k-1} \geq L_{2,k-1}$, the inequalities in (63), (64) remain valid when $L_{1,k-1}$, $L_{2,k-1}$ are replaced by $\alpha_{k-1}$, $\beta_{k-1}$, respectively.


The terms in the first inequalities of (63), (64) are the proximal minimization objectives with $\alpha_{k-1} = L_{1,k-1}$, $\beta_{k-1} = L_{2,k-1}$. The inequalities in (63), (64) are obtained by first applying the first-order definition of convexity, Def. 1(b), then adding the respective terms $\frac{L_{1,k-1}}{2}\|\tilde{\mathbf{w}}_{k-1} - \tilde{\mathbf{w}}\|^2$, $\frac{L_{2,k-1}}{2}\|\tilde{\mathbf{s}}_{k-1} - \tilde{\mathbf{s}}\|^2$, and finally using Lem. 7, the descent lemma.

Additionally, recall that the functions associated with the second inequalities of (63), (64) are the (unconstrained) objectives that are minimized by gradient descent with step sizes $1/L_{1,k-1}$, $1/L_{2,k-1}$, respectively. That is, the new iterates are $\tilde{\mathbf{w}}_k = \tilde{\mathbf{w}}_{k-1} - \frac{1}{L_{1,k-1}}\nabla_{\tilde{\mathbf{w}}}\tilde{g}(\tilde{\mathbf{w}}_{k-1})$ and $\tilde{\mathbf{s}}_k = \tilde{\mathbf{s}}_{k-1} - \frac{1}{L_{2,k-1}}\nabla_{\tilde{\mathbf{s}}}\tilde{g}(\tilde{\mathbf{s}}_{k-1})$. Therefore, from (63), (64) and Rem. 10, we note that the proximal objective and the gradient descent objective are both surrogate, albeit tight, upper bounds on the true objective for all $\alpha_{k-1} \geq L_{1,k-1}$ and all $\beta_{k-1} \geq L_{2,k-1}$. This interpretation is depicted graphically in Fig. 6 for the filter design objective in (52), with $\alpha_{k-1} = L_{1,k-1}$. A similar graphical interpretation holds for the waveform design stage and is therefore not shown.
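The surrogate interpretation can be illustrated on a toy unconstrained convex quadratic. The sketch below is not the STAP objective: it assumes $f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T\mathbf{B}\mathbf{x} - \mathbf{c}^T\mathbf{x}$ with hypothetical random `B` and `c`, and compares a gradient step of size $1/L$ with a proximal step using weight $\alpha = L$. Both minimize an upper bound that touches $f$ at the previous iterate, so the cost should never increase.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
M = rng.standard_normal((N, N))
B = M @ M.T + 0.1 * np.eye(N)            # symmetric positive definite (toy data)
c = rng.standard_normal(N)
L = np.linalg.eigvalsh(B)[-1]            # Lipschitz constant of the gradient B x - c


def f(x):
    return 0.5 * x @ B @ x - c @ x


def gradient_step(x):
    # Minimizer of the quadratic upper bound from the descent lemma
    return x - (B @ x - c) / L


def proximal_step(x, alpha):
    # Minimizer of f(z) + (alpha / 2) * ||z - x||^2
    return np.linalg.solve(B + alpha * np.eye(N), c + alpha * x)


x_gd = x_px = rng.standard_normal(N)
costs_gd, costs_px = [f(x_gd)], [f(x_px)]
for _ in range(50):
    x_gd = gradient_step(x_gd)
    x_px = proximal_step(x_px, alpha=L)
    costs_gd.append(f(x_gd))
    costs_px.append(f(x_px))
```

Plotting `costs_gd` and `costs_px` against the iteration index shows two non-increasing curves that approach the same minimum.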

Tikhonov interpretation. This interpretation is immediate from (58), (59). In fact, from (52), (53), the quadratic regularizers $\|\mathbf{w} - \mathbf{w}_{k-1}\|^2$, $\|\mathbf{s} - \mathbf{s}_{k-1}\|^2$ are essentially Tikhonov regularization terms. Geometrically they are spheres centered at $\mathbf{w}_{k-1}$, $\mathbf{s}_{k-1}$ and encourage the current iterates to be in the vicinity of the previous iterates. Furthermore, since in the limit the regularizer terms only decrease, this may also be seen as a vanishing Tikhonov regularization problem [24] for each iteration in both the waveform and the filter vectors.

Proximal minimization: a training-data-starved STAP solution. The regularization in (52), (53) leads to diagonally loaded solutions (58), (59) when compared to the constrained alternating minimization solutions in (18) and (43). In particular, the diagonal loading serves two important purposes. First, it offers a numerically stable solution by improving the conditioning of the matrix that is inverted. Second, and more importantly, it permits a weight vector solution when $\mathrm{rank}(\mathbf{R}_u(\mathbf{s})) < NML$.

Practical STAP contends with rank-deficient correlation matrices because sufficient training data from neighboring range cells are often unavailable, owing to outlier contamination or heterogeneity in the data. The solution in (52) alleviates such training-data-starved STAP scenarios.
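The effect of diagonal loading on a rank-deficient sample covariance can be sketched as follows. The dimensions, the loading weight `alpha`, the loading level `alpha / 2` and the steering vector are illustrative assumptions and are not taken from (58); the point is only that the loaded matrix is invertible with a bounded condition number even when there are fewer snapshots than dimensions.

```python
import numpy as np

rng = np.random.default_rng(4)
D, K = 20, 8                       # dimension and number of snapshots (K < D)
X = (rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))) / np.sqrt(2)
R_hat = X @ X.conj().T / K         # sample covariance, rank at most K

alpha = 0.2                        # hypothetical proximal weight
R_loaded = R_hat + 0.5 * alpha * np.eye(D)

rank_R_hat = np.linalg.matrix_rank(R_hat)
eigs_loaded = np.linalg.eigvalsh(R_loaded)
cond_loaded = eigs_loaded[-1] / eigs_loaded[0]

# A Capon-type weight vector now exists even though R_hat itself is singular
v = np.exp(1j * np.pi * 0.3 * np.arange(D))    # illustrative steering vector
w = np.linalg.solve(R_loaded, v)
w = w / (v.conj() @ w)             # scale so that w^H v = 1
```

The smallest eigenvalue of `R_loaded` is bounded below by `alpha / 2`, so a larger loading trades fidelity to the data for better conditioning.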

So far, we have considered the algorithms for waveform design without enforcing constraints such as constant modulus or sidelobe constraints. The minimum eigenvector solution belongs to this class of unconstrained waveform design. We will revisit this design by considering (19) and Lem. 4.

Remark 11. The minimum eigenvector solution in (20) is still optimal in the presence of clutter, provided $\mathbf{R}_i + \mathbf{R}_n$ and $\mathbf{R}_c(\mathbf{s})$ share the same eigenvector corresponding to their minimum eigenvalues, but with $\lambda_{\min}(\mathbf{R}_c(\mathbf{s})) = 0$, always.

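This is readily seen since the optimization in (19), ignoring the constraint for now, can be recast as $\max_{\mathbf{s}}\ (\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t))^H\,\mathbf{R}_u^{-1}(\mathbf{s})\,(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t))$. Now using Woodbury's identity [63], we have

$$(\mathbf{R}_i + \mathbf{R}_n + \mathbf{R}_c(\mathbf{s}))^{-1} = (\mathbf{R}_i + \mathbf{R}_n)^{-1} - (\mathbf{R}_i + \mathbf{R}_n)^{-1}\mathbf{R}_c(\mathbf{s})\big(\mathbf{I} + (\mathbf{R}_i + \mathbf{R}_n)^{-1}\mathbf{R}_c(\mathbf{s})\big)^{-1}(\mathbf{R}_i + \mathbf{R}_n)^{-1}. \tag{65}$$

Further, using the eigenvector relations $(\mathbf{R}_i + \mathbf{R}_n)(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)) = \lambda_{\min}(\mathbf{R}_i + \mathbf{R}_n)(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t))$ and $\mathbf{R}_c(\mathbf{s})(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)) = \lambda_{\min}(\mathbf{R}_c(\mathbf{s}))(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)) = \mathbf{0}$ in (65), it is readily seen that $(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t))^H(\mathbf{R}_i + \mathbf{R}_n + \mathbf{R}_c(\mathbf{s}))^{-1}(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)) = \lambda^{-1}_{\min}(\mathbf{R}_i + \mathbf{R}_n)$.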
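The identity in (65) and the resulting quadratic form can be checked with a short numerical sketch. Here `R_in` stands in for $\mathbf{R}_i + \mathbf{R}_n$ and `R_c` for a low-rank clutter matrix; both are random, hypothetical matrices. The second part assumes a scaled-identity interference-plus-noise matrix and a unit-norm vector `x` in the null space of `R_c`, which is the simplest setting in which the eigenvector conditions of Rem. 11 hold.

```python
import numpy as np

rng = np.random.default_rng(5)
D, r = 8, 2
M = rng.standard_normal((D, D)) + 1j * rng.standard_normal((D, D))
R_in = M @ M.conj().T + np.eye(D)              # stands in for R_i + R_n
C = rng.standard_normal((D, r)) + 1j * rng.standard_normal((D, r))
R_c = C @ C.conj().T                            # rank-r clutter stand-in

# Both sides of the matrix identity (65)
R_in_inv = np.linalg.inv(R_in)
lhs = np.linalg.inv(R_in + R_c)
rhs = R_in_inv - R_in_inv @ R_c @ np.linalg.inv(np.eye(D) + R_in_inv @ R_c) @ R_in_inv
max_err = np.max(np.abs(lhs - rhs))

# Special case: scaled-identity noise and a unit vector x with R_c x = 0
sigma2 = 0.5
x = np.linalg.svd(C)[0][:, -1]                  # orthogonal to the columns of C
quad = (x.conj() @ np.linalg.inv(sigma2 * np.eye(D) + R_c) @ x).real
```

In the special case the quadratic form equals `1 / sigma2`, i.e. the clutter has no effect along `x`.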

The simplest example where Rem. 11 is satisfied is when the noise correlation matrix is a scaled identity (which may not be practical for narrowband radar) and the clutter correlation matrix is low rank. In STAP, for ideal scenarios, insight into the clutter rank is obtained from Brennan's rule [1]-[3]. A high clutter rank prevails due to practical effects such as intrinsic clutter motion, velocity misalignment and crabbing, mutual coupling and antenna element mismatches, as well as clutter ambiguities in Doppler resulting in aliasing [2].


D.  Constant  modulus  alternating  minimization 

So far, the optimization problems had no specific constraints on the waveform (except the power/energy constraint). Constant modulus is a desirable property for a waveform [64]. The optimum weight vector is unchanged by introducing the constant modulus constraint, and is identical to (18) for the constrained alternating minimization.⁴

Fig. 6: Upper bounds on the objective for the proximal algorithm w.r.t. the filter design. A similar graphical interpretation for the waveform design, but with $L_{2,k-1}$, is also easily depicted but not shown here.

Since the optimization w.r.t. the weight vector is unchanged, we only treat the optimization for $\mathbf{s}$ with the constant modulus constraint, for a fixed but arbitrary $\mathbf{w}$, formulated below

$$
\begin{aligned}
\min_{\mathbf{s}}\ & \mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w} \\
\text{s.t.}\ & \mathbf{w}^H(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)) = \kappa \\
& |s_i| = p,\quad i = 1, 2, \ldots, N
\end{aligned} \tag{66}
$$

where $s_i$ is the $i$-th component of $\mathbf{s}$. Unlike, say, (17), notice that constraining the power of the waveform in (66) is unnecessary, since $p$ is fixed but can be chosen arbitrarily to scale the waveform's energy up or down to satisfy hardware limitations. Therefore, the last $N$ constraints in (66) implicitly impose the power requirement but, more importantly, also impose the constant modulus constraint.

The Lagrangian of (66) is expressed as

$$\mathcal{L}(\mathbf{s}, \gamma_7, \boldsymbol{\gamma}_8) = \mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w} + \mathrm{Re}\{\gamma_7(\mathbf{w}^H\mathbf{G}\mathbf{s} - \kappa)\} + \mathbf{s}^H\mathbf{D}_\gamma\mathbf{s} - p^2\mathbf{1}^T\boldsymbol{\gamma}_8 \tag{67}$$

⁴ The analysis of the proximal constrained alternating minimization with the constant modulus constraint is omitted, but can be readily derived from the analysis of its non-proximal counterpart, presented here.


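where the Lagrange parameter $\gamma_7 \in \mathbb{C}$ and the Lagrange parameter vector $\boldsymbol{\gamma}_8 = [\gamma_{81}, \gamma_{82}, \ldots, \gamma_{8N}]^T \in \mathbb{R}^N$ are for the Capon constraint and the $N$ constant modulus constraints, respectively. Furthermore, in (67), define $\mathbf{D}_\gamma = \mathrm{diag}(\gamma_{81}, \ldots, \gamma_{8N})$, i.e. a diagonal matrix. The KKT conditions are expressed as

$$\mathbf{s}_o(\mathbf{w}) = \frac{\kappa\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}) + \mathbf{D}_\gamma\Big)^{-1}\mathbf{G}^H\mathbf{w}}{\mathbf{w}^H\mathbf{G}\Big(\sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w}) + \mathbf{D}_\gamma\Big)^{-1}\mathbf{G}^H\mathbf{w}} \tag{68a}$$

$$|s_{oi}(\mathbf{w})| = p,\quad i = 1, 2, \ldots, N. \tag{68b}$$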

The waveform that simultaneously satisfies (68a) and (68b) is the solution. Moreover, note that (68a) and (68b) are $2N$ nonlinear equations in $2N$ unknowns. The first $N$ unknowns are $s_{oi}(\mathbf{w})$, $i = 1, 2, \ldots, N$, and the next $N$ unknowns are the Lagrange parameters $\gamma_{8i}$. Unfortunately, (68) is not in closed form, but it can be solved for the $N$ parameters $\gamma_{8i}$, $i = 1, 2, \ldots, N$, via a numerical root finder. Nonetheless, we note that $\gamma_{8i} \in (-\infty, \infty)$, and a reasonable initialization point for the numerical root finding is not forthcoming.

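Eliminating the constant modulus constraints. Instead of solving the $2N$ nonlinear equations in (68a), (68b), we take an alternative approach. One may reformulate the optimization (66) by eliminating the last $N$ constraints and imposing a structure on $\mathbf{s}$, namely $s_i = p\exp(j\alpha_i)$. Other structures exist, but in our experience complex exponentials are the easiest to manipulate. The new optimization problem is now w.r.t. $\boldsymbol{\alpha} = [\alpha_1, \alpha_2, \ldots, \alpha_N]^T \in \mathbb{R}^N$, expressed as

$$
\begin{aligned}
\min_{\boldsymbol{\alpha}}\ & \mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w} \\
\text{s.t.}\ & \mathbf{w}^H(\mathbf{v}(f_d)\otimes\mathbf{s}\otimes\mathbf{a}(\theta_t,\phi_t)) = \kappa
\end{aligned} \tag{69}
$$

where $\mathbf{s} = p[\exp(j\alpha_1), \exp(j\alpha_2), \ldots, \exp(j\alpha_N)]^T$ and $\alpha_i \in [0, 2\pi)$, $i = 1, 2, \ldots, N$. The Lagrangian corresponding to (69) is

$$\mathcal{L}(\boldsymbol{\alpha}, \gamma_9) = \mathbf{w}^H\mathbf{R}_u(\mathbf{s})\mathbf{w} + \mathrm{Re}\{\gamma_9(\mathbf{w}^H\mathbf{G}\mathbf{s} - \kappa)\}. \tag{70}$$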


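The KKTs are expressed as $\frac{\partial\mathcal{L}(\boldsymbol{\alpha}, \gamma_9)}{\partial\boldsymbol{\alpha}} = \mathbf{0}$ and $\mathbf{w}^H\mathbf{G}\mathbf{s} = \kappa$. Noting that $\boldsymbol{\alpha}$ is purely real, we have

$$\frac{\partial\mathcal{L}(\boldsymbol{\alpha}, \gamma_9)}{\partial\boldsymbol{\alpha}} = -j\sum_{q=1}^{Q}\mathbf{Z}_q\mathbf{s}\odot\mathbf{s}^* + j\sum_{q=1}^{Q}\mathbf{Z}_q^*\mathbf{s}^*\odot\mathbf{s} + \mathrm{Im}\{\gamma_9(\mathbf{w}^H\mathbf{G})^T\odot\mathbf{s}\} = \mathbf{0}. \tag{71}$$

The above equation can be simplified as $\mathrm{Im}\{\sum_{q=1}^{Q}\mathbf{Z}_q^*\mathbf{s}^*\odot\mathbf{s} - \frac{\gamma_9}{2}(\mathbf{w}^H\mathbf{G})^T\odot\mathbf{s}\} = \mathbf{0}$. Using this in (71), and taking the complex conjugate while absorbing the negative sign into the constant $\gamma_9$,⁵ we have the KKTs in final form expressed as

$$\mathrm{Im}\Big\{\sum_{q=1}^{Q}\mathbf{Z}_q\mathbf{s}_o\odot\mathbf{s}_o^* - \frac{\gamma_9}{2}(\mathbf{G}^H\mathbf{w})\odot\mathbf{s}_o^*\Big\} = \mathbf{0} \tag{72a}$$

$$\mathbf{w}^H\mathbf{G}\mathbf{s}_o = \kappa \tag{72b}$$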

where $\mathbf{0}$ is a column vector of all zeros of dimension $N$. The optimal solution $\mathbf{s}_o$ is a function of the optimal $\boldsymbol{\alpha}_o$. This relationship, although evident from (69), is not explicitly stressed in (72) for notational succinctness.

Define $\mathbf{Z}_Q := \sum_{q=1}^{Q}\mathbf{Z}_q(\mathbf{w})$ and let $z_{ij}$, $i = 1, 2, \ldots, N$, $j = 1, 2, \ldots, N$, be the $ij$-th element of $\mathbf{Z}_Q$. Noting that $\mathbf{Z}_Q$ is Hermitian, we also have $\mathrm{Im}\{z_{ii}\} = 0\ \forall i$, and $z_{ji} = z_{ij}^*$.
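The phase parametrization $s_i = p\exp(j\alpha_i)$ can be made concrete with a short sketch. It builds the waveform from the phases, evaluates the quadratic cost $\mathbf{s}^H\mathbf{Z}_Q\mathbf{s}$ for a random Hermitian stand-in `Z` (hypothetical data), and compares the closed-form derivative $2\,\mathrm{Im}\{\mathbf{s}^*\odot\mathbf{Z}_Q\mathbf{s}\}$ with central finite differences. Only the quadratic term is differentiated here; the Capon constraint term of the Lagrangian is deliberately left out.

```python
import numpy as np

rng = np.random.default_rng(6)
N, p = 6, 1.5
M = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
Z = M @ M.conj().T                       # Hermitian stand-in for sum_q Z_q(w)


def waveform(alpha):
    return p * np.exp(1j * alpha)        # every sample has modulus p


def objective(alpha):
    s = waveform(alpha)
    return (s.conj() @ Z @ s).real


def gradient(alpha):
    # d/d alpha_i of s^H Z s equals 2 Im{ conj(s_i) (Z s)_i } for Hermitian Z
    s = waveform(alpha)
    return 2.0 * np.imag(s.conj() * (Z @ s))


alpha = rng.uniform(0.0, 2.0 * np.pi, N)
eps = 1e-6
numeric = np.array([
    (objective(alpha + eps * e) - objective(alpha - eps * e)) / (2.0 * eps)
    for e in np.eye(N)
])
gradient_sum = gradient(alpha).sum()
```

The entries of the gradient sum to zero because $\mathbf{s}^H\mathbf{Z}_Q\mathbf{s}$ is real for Hermitian $\mathbf{Z}_Q$; equivalently, the cost is invariant to a common phase shift of all samples.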


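Proposition 5. The Lagrange parameter $\gamma_9 = 0$ solves (72).


Proof. For any $z \in \mathbb{C}$ and any $\theta \in [0, 2\pi]$, we have $\mathrm{Im}\{z\exp(j\theta)\} = \mathrm{Re}\{z\}\sin(\theta) + \mathrm{Im}\{z\}\cos(\theta)$. Using this and the fact that $\mathbf{Z}_Q = \mathbf{Z}_Q^H$, the $i$-th equation in (72a) can be simplified as

$$2p\sum_{j=1,\, j\neq i}^{N}\Big(\mathrm{Re}\{z_{ij}\}\sin(\alpha_j^o - \alpha_i^o) + \mathrm{Im}\{z_{ij}\}\cos(\alpha_j^o - \alpha_i^o)\Big) = \mathrm{Im}\{\gamma_9 u_i\exp(-j\alpha_i^o)\},\quad i = 1, 2, \ldots, N \tag{73}$$

where $u_i$ is the $i$-th element of $\mathbf{u} = \mathbf{G}^H\mathbf{w}$. Adding the $N$ equations in (73), the left-hand sides cancel pairwise because $z_{ji} = z_{ij}^*$, and it is easily seen that $\mathrm{Im}\{\gamma_9\sum_{i=1}^{N}u_i\exp(-j\alpha_i^o)\} = 0$. But we know from (72b) that $p\sum_{i=1}^{N}u_i\exp(-j\alpha_i^o) = \kappa$, where $\kappa \in \mathbb{R}$. Therefore this implies that $\mathrm{Im}\{\gamma_9\} = 0$, or in other words, $\gamma_9$ is purely real. Substituting this back into (72a) and following the same arguments as before, this is possible only if, trivially, $p = 0$ or $\gamma_9 = 0$. The former is false since $p = 0$ does not solve (72b); therefore the latter must be true. □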


Interpretation of $\gamma_9 = 0$. With $\gamma_9 = 0$, from (72a) we have that

$$\mathrm{Im}\Big\{\sum_{q=1}^{Q}\mathbf{Z}_q\mathbf{s}_o\odot\mathbf{s}_o^*\Big\} = \mathbf{0}, \qquad \mathbf{w}^H\mathbf{G}\mathbf{s}_o = \kappa. \tag{74}$$

The first equation in (74) does not depend on $p$, but the second does. Therefore $\gamma_9 = 0$ does not imply that the constraint in (69) is inactive. Rather, it implies that the KKTs enforce the Capon constraint in (69) for the constant modulus waveform by varying the unspecified modulus parameter $p$.

The result in Prop. 5 has some very interesting consequences. Using $\gamma_9 = 0$, the $N$ equations in (73), and therefore (72a), can be rewritten as a linear matrix equation $\mathbf{Z}_\alpha\mathbf{p}_{\alpha_o} = \mathbf{0}$, where $\mathbf{Z}_\alpha \in \mathbb{R}^{N\times 2\binom{N}{2}}$ and the vector $\mathbf{p}_{\alpha_o} = [\sin(\alpha_2^o - \alpha_1^o), \sin(\alpha_3^o - \alpha_1^o), \ldots, \sin(\alpha_N^o - \alpha_{N-1}^o), \cos(\alpha_2^o - \alpha_1^o), \ldots, \cos(\alpha_N^o - \alpha_{N-1}^o)]^T$, i.e. it has $2\binom{N}{2}$ components consisting of the sines and cosines of all possible differences $\alpha_j^o - \alpha_i^o$, $\forall i, \forall j \neq i$. In other words, $\mathbf{p}_{\alpha_o} \in \mathrm{null}(\mathbf{Z}_\alpha)$. The rank of $\mathbf{Z}_\alpha$ is not easy to calculate here, but its maximum value is $N$. Therefore, from the rank-nullity theorem, $\dim(\mathrm{null}(\mathbf{Z}_\alpha)) \geq N(N-2)$. Clearly there could exist multiple vectors in this null space, but we are not certain whether this translates to multiple solutions for $\boldsymbol{\alpha}_o$ from this linear equation alone. Nonetheless, if multiple solutions to this linear equation exist, they must also satisfy (72b) to be considered a solution to (69). In any case the optimal solution(s) are in $\mathcal{C}_{\alpha_o} \subset \mathbb{R}^N$, with

$$\mathcal{C}_{\alpha_o} = \Big\{\boldsymbol{\alpha}_o : \mathbf{p}_{\alpha_o} \in \mathrm{null}(\mathbf{Z}_\alpha),\ \sum_{i=1}^{N}u_i^*\exp(j\alpha_i^o) = \frac{\kappa}{p}\Big\}. \tag{75}$$

It remains to be seen whether $\mathcal{C}_{\alpha_o}$ is a singleton or comprises many elements, but we are optimistic that it would not turn out to be empty.

⁵ New $\gamma_9$ = old $-\gamma_9$.

E. Practical Considerations: Classical STAP vs. Waveform Adaptive STAP

Here we address practical considerations for the fast-time slow-time model in STAP, which aids the waveform design, and compare it with the classical (slow-time) model in STAP.

Hardware. The fast-time slow-time model in STAP does not necessitate newer hardware, nor does it require any modifications to the existing hardware. It does, however, assume that the current state of the art permits arbitrary waveform generation and adaptive transmit capabilities [38].

Computational complexity. The inclusion of the waveform causes the correlation matrices to have larger dimension. Inverting large matrices is computationally prohibitive. Classical STAP requires inverting a complex $ML \times ML$ matrix, which has a complexity between $O((ML)^{2.373})$ and $O((ML)^3)$ [65]. Waveform adaptive STAP requires inverting complex $NML \times NML$ matrices, which has a computational complexity between $O((NML)^{2.373})$ and $O((NML)^3)$ [65].
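To give a feel for the growth in cost, the following back-of-the-envelope sketch uses the sizes $M = 5$, $L = 32$, $N = 5$ adopted in the simulations of Section IV. The helper `inversion_cost` is a hypothetical illustration that ignores constants and lower-order terms, so only the ratios are meaningful.

```python
M, L, N = 5, 32, 5               # sensors, pulses, fast-time samples (Section IV)

dim_classical = M * L            # matrix dimension in classical STAP
dim_waveform = N * M * L         # matrix dimension in waveform adaptive STAP


def inversion_cost(dim, exponent):
    """Order-of-magnitude operation count; constants are ignored."""
    return dim ** exponent


# Cost ratio under cubic scaling and under the 2.373 exponent
ratio_cubic = inversion_cost(dim_waveform, 3) / inversion_cost(dim_classical, 3)
ratio_fast = inversion_cost(dim_waveform, 2.373) / inversion_cost(dim_classical, 2.373)
```

With cubic scaling the ratio is $N^3 = 125$, i.e. each inversion is roughly two orders of magnitude more expensive than in classical STAP for these sizes.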

Training data. Because the correlation matrices have larger dimensions once the waveform is included, it may appear, deceptively, that more training data (from more neighboring range cells) are needed to estimate them. This is not true, since including the waveform simply includes the fast-time samples. The fast-time slow-time model uses the raw data prior to pulse compression or matched filtering; hence the training data requirements are identical to those of the classical STAP case. Note that we are not interested in resolving targets within the pulse duration, but rather outside it.

IV.  Simulations 

First, we address simulations not specific to radar.

Fig. 7: Simulations supporting Prop. 3; x-axis: $\gamma_2$. Monotone increasing (a) $f(\gamma_2)$ and corresponding (b) $\gamma_2 f(\gamma_2)$. Monotone decreasing (c) $f(\gamma_2)$ and corresponding (d) $\gamma_2 f(\gamma_2)$.


A.  Simulations  supporting:  Prop.  3  and  Rem.  7 

We ran simulations with random $z_n$ and random $d_n$ to analyze $f(\gamma_2)$ and $\gamma_2 f(\gamma_2)$ numerically. In our extensive simulations we chose $z_n$ from complex normal distributions with different means and different variances. Since $d_n > 0$ for all $n$, we used uniform distributions with different supports on the positive real axis excluding zero. We show only two representative simulation results, for the monotonically increasing and decreasing cases, in Fig. 7(a)(c), respectively. The corresponding functions $\gamma_2 f(\gamma_2)$ are shown in Fig. 7(b)(d) for the two cases.

Simulations supporting Rem. 7 are presented next. Some parameters specifying the function $r(\gamma_6)$ were simulated randomly with the same distributions used in generating Fig. 7. The parameters $\kappa = 2$, $P_o = 10$ were used in generating Fig. 8(a); the function $\gamma_6 r(\gamma_6)$ is also shown in Fig. 8(b). Note that $P_o = 10$ is a contrived example; typical radar applications require $P_o$ to be several hundred kW or several hundred MW.

Fig. 8: Simulations supporting Rem. 7; x-axis: $\gamma_6$. Example showing one zero crossing of (a) $r(\gamma_6)$ and corresponding (b) $\gamma_6 r(\gamma_6)$. Monotone decreasing example for $P_o \gg \kappa^2$ in (c) $r(\gamma_6)$.

The zero crossing is the intersection of the dashed black line with the blue curve in Fig. 8(a). Now using $P_o = 20$ and keeping the other parameters fixed, we obtain Fig. 8(c), which shows that $r(\gamma_6)$ is monotonically decreasing with limit 0 at $\infty$.

Fig. 9: Constrained alternating minimization: objective cost vs. iterations for 3 random, independent waveform initializations (inset: 25 random initializations).

Fig. 10: (a) Constrained alternating minimization, (b) proximal constrained alternating minimization (inset: magnified), minimum eigenvector waveform (dashed black).

Fig. 11: Convergence of a non-constant modulus initial waveform to a constant modulus waveform.

Fig. 12: Constant modulus waveform design compared with the non-constant modulus design over 200 Monte Carlo trials initialized with: (a) random non-constant modulus Gaussian waveforms, (b) random unit modulus waveforms with phase drawn uniformly from $[-\pi, \pi]$.


Radar-specific simulations: From here onward, some parameters are common to all the simulation examples and are stated now. The simulation parameters are in SI units unless mentioned otherwise. To reduce the computational complexity of inverting large matrices and computing their eigendecompositions, we set the number of sensors, waveform transmissions, and fast-time samples in the waveform to $M = 5$, $L = 32$, $N = 5$, respectively. The carrier frequency was chosen to be 1 GHz, and the radar bandwidth was 50 MHz. The element spacing was $d = \lambda_o/2$.

Fig. 13: Oracle: Reed Mallett Brennan rule.

Fig. 14: Adapted patterns using the designed waveform from alternating minimization; the dashed line is the Doppler as a function of angle predicted by theory. (a) No clutter Doppler ambiguities, (b) clutter Doppler ambiguities shown with arrows.

Fig. 15: ROC curves for (a) non-constant modulus design, (b) constant modulus design.

Fig. 16: Rank-deficient waveform adaptive STAP: (a) constrained alternating minimization, (b) constrained proximal alternating minimization.

B. Constrained alternating minimization

The noise correlation matrix was assumed to have a correlation function given by $\exp(-|0.005n|)$, $n = 0, 1, \ldots, NML$. Two interference sources were considered, at $(\theta = 0.3941, \phi = 0.3)$ and at $(-0.4941, 0.3)$. Both interference sources had identical discrete correlation functions given by $0.2^{|n|}$, $n = 0, \pm 1, \ldots$. To simulate clutter we considered two clutter patches, consisting of five scatterers each. The clutter correlation functions corresponding to the two patches were $\exp(-0.2|p|)$ and $\exp(-0.1|p|)$, $p = 0, \pm 1, \ldots, \pm P$. The rest of the parameters are identical to those used in [57].

In Fig. 9, the STAP beamformer objective vs. iterations is shown for 3 independent, random waveform initializations, while the inset shows 25 independent initializations or trials. The alternating minimization was initialized with waveforms whose fast-time samples were chosen independently from a standard complex Gaussian distribution. The algorithm was terminated as soon as the current waveform iterate violated the set power constraint. From the figure and its inset it is clear that the STAP beamformer output is non-increasing, thereby validating the monotonicity property of this algorithm. More importantly, Fig. 9 shows that the final objective value and the number of iterations needed to reach it differ from trial to trial, which is attributed to the joint non-convexity of the objective w.r.t. $\mathbf{w}$ and $\mathbf{s}$. Sensitivity to the random initialization is therefore duly noted.

C.  Constrained  proximal  alternating  minimization 

All the simulation parameters are identical to the previous case. The constrained alternating minimization was initialized with random waveforms as in Fig. 9, immediately followed by its proximal counterpart. The former algorithm was terminated as in the previous case; the latter was then run for 200 iterations. Three representative trials are shown in Fig. 10(a)(b) for the constrained alternating minimization and its proximal counterpart. In Fig. 10(b), the dashed black lines are the final objective values obtained from the minimum eigenvector waveform having the same energy as its proximal counterpart. For the three trials, and not surprisingly, the proximal objective value is for all practical purposes identical to that obtained from the waveform derived from (21), as evidenced by the inset. This validates the implementation of both the constrained algorithm and its proximal counterpart. In Fig. 10(b), unlike Fig. 9, three accumulation points w.r.t. the objective are clearly visible for the three trials, indicating strong convergence.

D.  Constant  modulus 

The constant modulus algorithm was implemented numerically via the KKTs, i.e. (72), using the results from Prop. 5. The simulation parameters are identical to the two previous scenarios. In Fig. 11, the modulus of the fast-time waveform samples vs. iterations is shown for the constant modulus alternating minimization algorithm. As seen from this figure, the algorithm was initialized with a non-constant modulus waveform. For this random initialization, convergence to a constant modulus is achieved in three iterations or fewer. We have, however, encountered cases where the algorithm did not converge for several iterations. This problem was not encountered when the algorithm was initialized with a random constant modulus waveform. Thus, in practice, it is advocated that this algorithm be initialized with an arbitrary constant modulus waveform, e.g. a chirp or a rectangular pulse.

The ratio of the final objective for the constant modulus algorithm to the objective for the non-constant modulus waveform design using the constrained alternating minimization is shown in Fig. 12(a)(b) for 200 random waveform initializations. After convergence, the constant modulus objective is larger than the non-constant modulus objective, a trend readily observed in Fig. 12(a)(b) for the 200 trials. This is to be expected since constant modulus waveforms are a subset of their non-constant modulus counterparts. In particular, the amplitude is constrained temporally in the constant modulus design while the phase is optimized, whereas both the phase and the amplitude are optimized in the non-constant modulus design. From these figures we see that at one extreme this ratio is as much as 10 dB, and at the other it is almost 0 dB. Nonetheless, on average, the non-constant modulus waveforms have lower objective values than the objective values derived from the constant modulus waveforms.


E.  Oracle  sample  support  requirements 


The ideal SINR is

$$\frac{P_t\,|\mathbf{w}_o^H(\mathbf{v}(f_d)\otimes\mathbf{s}_o\otimes\mathbf{a}(\theta_t,\phi_t))|^2}{\mathbf{w}_o^H\mathbf{R}_u(\mathbf{s}_o)\mathbf{w}_o}$$

where $\mathbf{w}_o$, $\mathbf{s}_o$ are obtained after optimization. Using the estimated covariance matrix, say the sample covariance matrix, the estimated SINR is defined analogously, where $\hat{\mathbf{R}}_u(\cdot)$ is the estimated sample covariance matrix, and $\mathbf{w}_{est}$, $\mathbf{s}_{est}$ are the weight and waveform vectors optimized by using the estimated covariance in the optimization instead.

A true SINR loss can be computed by using the estimated covariance, i.e. $\hat{\mathbf{R}}_u$, in (15) and running the optimization algorithm for each Monte Carlo trial, resulting in an estimated $\mathbf{s}_{est}$. This is computationally heavy on our current resources and is therefore not reported here. Instead, we assume that an oracle has provided the optimal waveform to be transmitted. Then the oracle loss of SINR due to the estimated covariance is a random variable, captured by

$$\mathrm{SINR}_{\mathrm{loss}} = \frac{\mathbf{w}_o^H\mathbf{R}_u(\mathbf{s}_o)\mathbf{w}_o}{\mathbf{w}_{est}^H\mathbf{R}_u(\mathbf{s}_o)\mathbf{w}_{est}}.$$

Random data are now generated from zero-mean multivariate complex Gaussian distributions to compute the sample covariance matrices, i.e. $\hat{\mathbf{R}}_i$, $\hat{\mathbf{R}}_n$ and $\hat{\mathbf{R}}_u$. Two hundred Monte Carlo trials were run with differing sample supports. The mean and standard deviation of the oracle $\mathrm{SINR}_{\mathrm{loss}}$ are shown in Fig. 13(a)(b). Not surprisingly, the RMB rule is followed perfectly. For the same sample support, the standard deviation is a few orders of magnitude smaller than the mean.
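A small Monte Carlo sketch of the RMB rule is given below for a generic adaptive beamformer with a fixed steering vector. It does not reproduce the waveform-dependent scenario of Fig. 13: the covariance `R`, the steering vector `v` and the sizes are illustrative assumptions. For Gaussian training data, the classical result is that the normalized SINR loss of a sample-matrix-inversion filter follows a Beta distribution with mean $(K + 2 - D)/(K + 1)$ for $K$ snapshots in dimension $D$.

```python
import numpy as np

rng = np.random.default_rng(7)
D, K, trials = 8, 16, 2000             # dimension, snapshots, Monte Carlo runs

# Illustrative true covariance and steering vector (not the paper's scenario)
idx = np.arange(D)
R = 0.9 ** np.abs(idx[:, None] - idx[None, :]) + 0.1 * np.eye(D)
v = np.exp(1j * np.pi * 0.25 * idx)
R_sqrt = np.linalg.cholesky(R)         # R = R_sqrt @ R_sqrt^T
w_opt = np.linalg.solve(R, v)          # filter using the true covariance


def output_sinr(w):
    """Output SINR (up to the target power) evaluated with the true covariance."""
    return abs(w.conj() @ v) ** 2 / (w.conj() @ R @ w).real


losses = np.empty(trials)
for t in range(trials):
    noise = (rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K))) / np.sqrt(2)
    X = R_sqrt @ noise                 # K snapshots with covariance R
    R_hat = X @ X.conj().T / K         # sample covariance matrix
    w_est = np.linalg.solve(R_hat, v)  # filter using the estimated covariance
    losses[t] = output_sinr(w_est) / output_sinr(w_opt)

mean_loss = losses.mean()
rmb_prediction = (K + 2 - D) / (K + 1)
```

With $K = 2D$ the predicted mean loss is close to one half, i.e. about 3 dB, which is the familiar rule of thumb for the required sample support.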


F.  Adapted  patterns 

The  adapted  pattern  for  the  waveform  dependent  STAP  objective  function  is  expressed  as 

V{fdl9)  =  |wf  (v(/d)  O  s0  O  a(6>,()>))|2,  for  a  fixed  0.  (76) 

The adapted pattern in (76) is a function of angle, Doppler, and the optimal weight and waveform vectors, $\mathbf{w}_o$ and $\mathbf{s}_o$, respectively. Two examples are shown in Fig. 14(a)(b). Two interferers at ($\theta = -0.2$, $\phi = \pi/3$) and at ($-0.2$, $\pi/3$) were chosen. We modeled the clutter discretely from all azimuth angles from $-\pi/2$ to $\pi/2$ in discrete increments of $0.005\pi/2$ radians. The clutter patches were fixed at an elevation angle of $\pi/4$ radians. The target was assumed to be at $\theta_t = 0.7$, $\phi_t = \pi/4$ with normalized Doppler equal to 0.31, and at $\theta_t = 0$, $\phi_t = \pi/4$ with normalized Doppler equal to -0.4, in Fig. 14(a)(b), respectively. The adapted patterns in Fig. 14 are identical (up to a scaling) to those obtained from the classical STAP adapted pattern. This is not a surprise but is rather reassuring, since the waveform in (76) affects all the Doppler frequencies and the azimuths identically. Moreover, we can always consider $\mathbf{s}_o \otimes \mathbf{a}(\theta, \phi)$ as a new/modified spatial steering vector. Hence, as expected, the inclusion of the optimal waveform will not alter the shape of the classical STAP adapted pattern.
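
As an illustration of how (76) might be evaluated on a grid, the sketch below forms the Kronecker-structured steering vector and takes the squared magnitude of its inner product with the weight vector. The steering vector definitions (uniform linear array with half-wavelength spacing, uniformly sampled slow time) and the function names are assumptions made for this example and may differ from the models used in the report.

```python
import numpy as np

def doppler_steering(fd, M):
    """Assumed temporal steering vector for normalized Doppler fd over M pulses."""
    return np.exp(2j * np.pi * fd * np.arange(M))

def spatial_steering(theta, phi, P):
    """Assumed spatial steering vector: P-element ULA, half-wavelength spacing."""
    return np.exp(1j * np.pi * np.arange(P) * np.sin(theta) * np.sin(phi))

def adapted_pattern(w, s, fd_grid, theta_grid, phi, M, P):
    """Evaluate |w^H (v(fd) kron s kron a(theta, phi))|^2 on a Doppler-azimuth grid."""
    pattern = np.empty((len(fd_grid), len(theta_grid)))
    for i, fd in enumerate(fd_grid):
        v = doppler_steering(fd, M)
        for k, theta in enumerate(theta_grid):
            a = spatial_steering(theta, phi, P)
            t = np.kron(np.kron(v, s), a)
            pattern[i, k] = np.abs(np.vdot(w, t)) ** 2  # vdot conjugates w
    return pattern
```

Because `np.kron(s, a)` enters every grid point in the same way, replacing `s` only rescales and reweights the spatial part of the steering vector, which is consistent with the observation that the waveform does not change the shape of the pattern.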

G.  Detection 

Here, we compare the detection performance obtained with the optimized waveforms against that obtained with randomly selected waveforms.
The  detection  test  for  the  presence  of  a  target  at  a  particular  range  cell  is  cast  as  a  binary  hypothesis  test, 

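$$H_0: \mathbf{w}^H\tilde{\mathbf{y}} = \mathbf{w}^H\mathbf{y}_u, \qquad H_1: \mathbf{w}^H\tilde{\mathbf{y}} = \mathbf{w}^H\mathbf{y} + \mathbf{w}^H\mathbf{y}_u \qquad (77)$$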

where $\mathbf{y}$, $\mathbf{y}_u$ have been defined in (8), (9). Assuming that $\mathbf{y}_u$ is complex normal distributed, the test in (77) is readily evaluated. The weight vector is obtained after the optimization. The ROC curves for SINRs of 0 dB, 3 dB and 6 dB are shown in Fig. 15(a)(b) for the non-constant-modulus and constant-modulus designs, respectively. For generating Fig. 15(a), a random waveform was used having the same energy as that obtained after the alternating minimization algorithm. The waveform samples were drawn independently from a complex Gaussian distribution. In Fig. 15(b), a chirp waveform was used having the same bandwidth and energy as its optimized constant modulus counterpart. As expected, these figures show that, from a detection standpoint, an optimized waveform performs much better than an un-optimized waveform.
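
To show how a ROC curve follows from a test of this form, here is a minimal sketch under an assumption that the report does not state: the target amplitude is modeled as zero-mean complex Gaussian, and the detector compares the squared magnitude of the filter output to a threshold. With complex normal interference at the filter output, this gives the closed form P_d = P_fa^(1/(1 + SINR)). The function name and the chosen grid of false alarm probabilities are illustrative only.

```python
import numpy as np

def detection_probability(sinr_db, pfa):
    """P_d for a complex Gaussian amplitude in complex Gaussian interference.

    Under H0 the squared filter output is exponential with mean sigma^2, so
    P_fa = exp(-gamma / sigma^2). Under H1 (assumed model) the mean becomes
    sigma^2 * (1 + SINR), which gives P_d = P_fa ** (1 / (1 + SINR)).
    """
    sinr = 10.0 ** (np.asarray(sinr_db, dtype=float) / 10.0)
    return np.asarray(pfa, dtype=float) ** (1.0 / (1.0 + sinr))

# ROC curves at the three output SINR values considered in the text.
pfa_grid = np.logspace(-4, 0, 50)
roc_curves = {snr_db: detection_probability(snr_db, pfa_grid)
              for snr_db in (0.0, 3.0, 6.0)}
```

In this model the waveform and filter enter only through the output SINR, so any design that raises the SINR moves the whole ROC curve upward.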

H.  Realistic  STAP  waveform  design 

We consider a scenario frequently encountered in STAP, in which the sample covariance matrix is rank deficient due to the paucity of training data. The simulation parameters are identical to those used for Fig. 9, except that we considered ground clutter from all azimuths in $[-\pi/2, \pi/2]$, similar to those used in generating Fig. 14. Furthermore, for generating Fig. 16, we constrained the rank of the resulting correlation matrices to be 30, equal to the numerical rank of the clutter correlation matrix. The alternating minimization is first run for 20 iterations assuming an arbitrary diagonal loading factor equal to 100. After termination of this algorithm, the proximal algorithm is employed for 50 iterations. The results are shown in Fig. 16(a)(b). It is noted that in practice the 'true' minimum eigenvector cannot be computed due to the rank deficiency. Interestingly, the waveforms designed by the proximal optimization nonetheless result in a STAP objective value which is close to that obtained from the waveform estimated from the 'true' minimum eigenvector. However, extensive simulations for the rank deficient STAP are needed to verify if this behavior is seen for other classes of noise plus interference, and clutter correlation matrices.
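
The role of diagonal loading in the rank deficient case can be illustrated with a short sketch. It only shows the weight computation for a fixed steering vector, not the alternating or proximal iterations of the report. The matrix sizes, the random training data, and the steering vector are hypothetical, and the loading factor of 100 is taken from the text above.

```python
import numpy as np

def loaded_mvdr_weights(R_hat, t, loading):
    """Unit-gain MVDR weights from a diagonally loaded covariance estimate."""
    N = R_hat.shape[0]
    R_loaded = R_hat + loading * np.eye(N)   # loading restores full rank
    x = np.linalg.solve(R_loaded, t)
    return x / np.vdot(t, x)

# Hypothetical example: fewer snapshots (K) than dimensions (N).
rng = np.random.default_rng(1)
N, K = 16, 6
Y = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
R_hat = Y @ Y.conj().T / K                   # rank at most K, hence singular
t = np.ones(N, dtype=complex)                # hypothetical steering vector
w = loaded_mvdr_weights(R_hat, t, loading=100.0)
```

Without the loading term, `np.linalg.solve` would be applied to a singular matrix; with it, the weights are well defined and still satisfy the unit-gain constraint.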
V.  Conclusions

Waveform design in STAP was the focus of this report, assuming the dependence of the clutter response on the transmitted waveform. Our preliminary simulations indicate that the objective function is jointly non-convex in the weight and waveform vectors. However, we showed analytically that the objective function is individually convex in the waveform and the weight vector. This motivated a constrained alternating minimization technique which iteratively optimizes one vector while keeping the other fixed. A constrained proximal alternating minimization technique was proposed to handle rank deficient STAP correlation matrices. To address practical design constraints, we incorporated constant modulus constraints in our alternating minimization formulation. Simulations were chosen to demonstrate the monotonic decrease of the MVDR objective function using this alternating minimization algorithm. Preliminary simulations were presented to validate the theory.

Acknowledgment 

This work was sponsored by US AFOSR under project 13RY10COR. All views and opinions expressed here are the authors' own and do not constitute endorsement from the Department of Defense or the USAF.

References 

[1]  R.  Klemm,  Principles  of  Space-Time  Adaptive  Processing.  Institution  of  Electrical  Engineers,  2002. 

[2]  J.  Ward,  Space-time  Adaptive  Processing  for  Airborne  Radar ,  ser.  Technical  report  (Lincoln  Laboratory).  Massachusetts 
Institute  of  Technology.  Lincoln  Laboratory,  1994. 

[3]  J.  Guerci,  Space-Time  Adaptive  Processing  for  Radar.  Artech  House,  2003. 

[4]  L.  E.  Brennan  and  L.  S.  Reed.  “Theory  of  Adaptive  Radar,”  IEEE  Transactions  on  Aerospace  and  Electronic  Systems ,  vol. 
AES-9,  no.  2,  pp.  237-252,  Mar.  1973. 

[5]  D.  Madurasinghe  and  A.  P.  Shaw,  “Mainlobe  jammer  nulling  via  tsi  finders:  a  space  fast-time  adaptive  processor,”  EURASIP 
J.  Appl.  Signal  Process.,  vol.  2006.  pp.  221-221,  Jan.  2006. 

[6]  Y.  Seliktar,  D.  B.  Williams,  and  E.  J.  Holder,  “A  space/fast-time  adaptive  monopulse  technique,”  EURASIP  J.  Appl.  Signal 
Process.,  vol.  2006,  pp.  218-218,  Jan.  2006. 

[7]  J.  Capon,  “High-resolution  frequency-wavenumber  spectrum  analysis,”  Proceedings  of  the  IEEE,  vol.  57,  no.  8,  pp.  1408- 
1418,  Jun.  1969. 

[8]  M.  J.  D.  Powell,  “An  efficient  method  for  finding  the  minimum  of  a  function  of  several  variables  without  calculating 
derivatives,”  The  Computer  Journal,  vol.  7,  no.  2,  pp.  155-162,  1964. 

[9]  M.  Powell,  “On  search  directions  for  minimization  algorithms,”  Mathematical  Programming,  vol.  4,  no.  1,  pp.  193-201, 
1973. 

[10]  W.  I.  Zangwill,  “Minimizing  a  function  without  calculating  derivatives,”  The  Computer  Journal,  vol.  10,  no.  3,  pp.  293-296, 
1967. 

[11]  J.  Ortega  and  W.  Rheinboldt,  Iterative  solution  of  nonlinear  equations  in  several  variables.  Academic  Press,  1970. 

[12]  Z.  Luo  and  P.  Tseng,  “On  the  convergence  of  the  coordinate  descent  method  for  convex  differentiable  minimization,” 
Journal  of  Optimization  Theory  and  Applications,  vol.  72,  no.  1,  pp.  7-35,  1992. 


43 

Approved  for  public  release;  distribution  unlimited. 


P.  SETLUR  AND  M.  RANGASWAMY:  AFRL  SENSORS  DIRECTORATE  TECH.  REPORT.  2014. 


44 


[13]  A.  Auslender,  “Asymptotic  properties  of  the  Fenchel  dual  functional  and  applications  to  decomposition  problems,”  J.  Optim. 
Theory  Appl.,  vol.  73,  no.  3,  pp.  427-449,  Jun.  1992. 

[14]  D.  P.  Bertsekas,  Nonlinear  Programming,  2nd  ed.  Athena  Scientific,  1999. 

[15]  L.  Grippo  and  M.  Sciandrone,  “On  the  convergence  of  the  block  nonlinear  gauss-seidel  method  under  convex  constraints,” 
Oper.  Res.  Lett.,  vol.  26,  no.  3,  pp.  127-136,  Apr.  2000. 

[16]  H.  Attouch,  J.  Bolte,  P.  Redont,  and  A.  Soubeyran,  “Proximal  alternating  minimization  and  projection  methods  for 
nonconvex  problems:  An  approach  based  on  the  Kurdyka-Lojasiewicz  inequality,”  Math.  Oper.  Res.,  vol.  35,  no.  2,  pp. 
438-457,  May  2010. 

[17]  A.  Beck,  “On  the  convergence  of  alternating  minimization  with  applications  to  iteratively  reweighted  least  squares  and 
decomposition  schemes,”  Optimization  Online,  2013. 

[18]  P.  Setlur  and  M.  Rangaswamy,  “Proximal  constrained  waveform  design  algorithms  for  cognitive  radar  stap,”  in  Asilomar 
Conference  on  Signals,  Systems  and  Computers,  Nov  2014. 

[19]  B.  Martinet,  “Breve  communication,  regularisation  d’inequations  variationnelles  par  approximations  successives,”  ESAIM: 
Mathematical  Modelling  and  Numerical  Analysis  -  Modelisation  Mathematique  et  Analyse  Numerique,  vol.  4,  no.  R3,  pp. 
154-158,  1970. 

[20]  R.  Rockafellar,  “A  dual  approach  to  solving  nonlinear  programming  problems  by  unconstrained  optimization,”  Mathematical 
Programming,  vol.  5,  no.  1,  pp.  354—373,  1973. 

[21]  D.  P.  Bertsekas  and  P.  Tseng,  “Partial  proximal  minimization  algorithms  for  convex  programming,”  SIAM  Journal  on 
Optimization,  vol.  4,  pp.  551-572,  1994. 

[22]  R.  T.  Rockafellar,  “Monotone  operators  and  the  proximal  point  algorithm,”  SIAM  Journal  on  Control  and  Optimization, 
vol.  14,  no.  5,  pp.  877-898,  1976. 

[23]  P.  Combettes  and  J.-C.  Pesquet,  “Proximal  splitting  methods  in  signal  processing,”  in  Fixed-Point  Algorithms  for  Inverse 
Problems  in  Science  and  Engineering,  ser.  Springer  Optimization  and  Its  Applications,  H.  H.  Bauschke,  R.  S.  Burachik, 
P.  L.  Combettes,  V.  Elser,  D.  R.  Luke,  and  H.  Wolkowicz,  Eds.  Springer  New  York,  2011,  pp.  185-212. 

[24]  N.  Parikh  and  S.  Boyd,  Proximal  Algorithms,  ser.  Foundations  and  Trends(r)  in  Optimization.  Now  Publishers  Incorporated, 
2013. 

[25]  P.  Woodward  and  I.  Davies,  “Information  theory  and  inverse  probability  in  telecommunication,”  Proceedings  of  the  IEE-Part 
III:  Radio  and  Communication  Engineering,  vol.  99,  no.  58,  pp.  37-44,  1952. 

[26]  P.  Woodward,  “Theory  of  radar  information,”  Information  Theory,  IRE  Professional  Group  on,  vol.  1,  no.  1,  pp.  108-113, 
1953. 

[27]  - ,  Probability  and  information  theory,  with  applications  to  radar.  Pergamon  press  London,  1953. 

[28]  J.  Benedetto,  I.  Konstantinidis,  and  M.  Rangaswamy,  “Phase-coded  waveforms  and  their  design,”  IEEE  Signal  Processing 
Magazine,  vol.  26,  no.  1,  pp.  22-31,  Jan  2009. 

[29]  D.  DeLong  and  E.  Hofstetter,  “On  the  design  of  optimum  radar  waveforms  for  clutter  rejection,”  IEEE  Trans.  Inf.  Theory, 
vol.  13,  no.  3,  pp.  454-463,  1967. 

[30]  - ,  “The  design  of  clutter-resistant  radar  waveforms  with  limited  dynamic  range,”  IEEE  Trans.  Inf.  Theory,  vol.  15, 

no.  3,  pp.  376-385,  May  1969. 

[31]  D.  DeLong,  “Design  of  radar  signals  and  receivers  subject  to  implementation  errors,”  IEEE  Trans.  Inf.  Theory,  vol.  16, 
no.  6,  pp.  707-711,  Nov  1970. 


44 

Approved  for  public  release;  distribution  unlimited. 


P.  SETLUR  AND  M.  RANGASWAMY:  AFRL  SENSORS  DIRECTORATE  TECH.  REPORT.  2014. 


45 


[32]  S.  Kay,  “Optimal  signal  design  for  detection  of  gaussian  point  targets  in  stationary  gaussian  clutter/reverberation,”  IEEE 
Jour.  Sel.  Top.  Signal  Proc.,  vol.  1,  no.  1,  pp.  31-41,  2007. 

[33]  C.-Y.  Chen  and  P.  Vaidyanathan,  “MIMO  radar  waveform  optimization  with  prior  information  of  the  extended  target  and 
clutter,”  TransSP,  vol.  57,  no.  9,  pp.  3533-3544,  2009. 

[34]  S.  Pillai,  H.  Oh.  D.  Youla,  and  J.  Guerci,  “Optimal  transmit-receiver  design  in  the  presence  of  signal-dependent  interference 
and  channel  noise,”  IEEE  Trans.  Inf.  Theory,  vol.  46,  no.  2,  pp.  577-584,  2000. 

[35]  A.  Aubry,  A.  De  Maio,  M.  Piezzo,  A.  Farina,  and  M.  Wicks,  “Cognitive  design  of  the  receive  filter  and  transmitted  phase 
code  in  reverberating  environment,”  IET  Radar  Sonar  and  Navig.,  vol.  6,  no.  9,  pp.  822-833,  December  2012. 

[36]  A.  Aubry,  A.  DeMaio,  A.  Farina,  and  M.  Wicks,  “Knowledge-aided  (potentially  cognitive)  transmit  signal  and  receive  filter 
design  in  signal-dependent  clutter,”  IEEE  Trans.  Aerospace  and  Electronic  Systems,  vol.  49,  no.  1,  pp.  93-117,  Jan  2013. 

[37]  G.  Cui,  H.  Li,  and  M.  Rangaswamy,  “MIMO  radar  waveform  design  with  constant  modulus  and  similarity  constraints,” 
IEEE  Trans.  Signal  Processing,  vol.  62,  no.  2,  pp.  343-353,  Jan  2014. 

[38]  D.  Cochran,  S.  Suvorova,  S.  Howard,  and  W.  Moran,  “Waveform  libraries:  Measures  of  effectiveness  for  radar  scheduling,” 
IEEE  Signal  Processing  Magazine,  vol.  26,  no.  1,  pp.  12-21,  2009. 

[39]  P.  Setlur,  T.  Negishi,  N.  Devroye,  and  D.  Erricolo,  “Multipath  exploitation  in  non-los  urban  synthetic  aperture  radar,”  IEEE 
Jour.  Selected  Top.  Sign.  Proc.,  vol.  8.  no.  1,  pp.  137-152,  Feb  2014. 

[40]  J.  Guerci  and  E.  Baranoski,  “Knowledge-aided  adaptive  radar  at  DARPA:  an  overview,”  IEEE  Signal  Processing  Magazine, 
vol.  23,  no.  1,  pp.  41-50,  Jan  2006. 

[41]  D.  Middleton.  An  introduction  to  statistical  communication  theory:  An  IEEE  press  classic  reissue.  Piscataway,  NJ,  USA: 
IEEE  Press,  1996. 

[42]  H.  Van  Trees,  Detection,  Estimation,  and  Modulation  Theory,  ser.  Detection,  Estimation,  and  Modulation  Theory.  Wiley, 
2004,  no.  pt.  1. 

[43]  P.  Stoica,  H.  He,  and  J.  Li,  “Optimization  of  the  receive  filter  and  transmit  sequence  for  active  sensing,”  IEEE  Trans. 
Signal  Processing,  vol.  60,  no.  4,  pp.  1730-1740,  April  2012. 

[44]  L.  Patton,  D.  Hack,  and  B.  Himed,  “Adaptive  pulse  design  for  space-time  adaptive  processing,”  in  IEEE  Sensor  Array  and 
Multichannel  Signal  Processing  Workshop  (SAM),  June  2012,  pp.  25-28. 

[45]  P.  Setlur  and  N.  Devroye,  “Adaptive  waveform  scheduling  in  radar:  an  information  theoretic  approach,”  in  In  Proc.  SPIE, 
Defense  Security  and  Sensing,  Symp. ,  vol.  8361,  2012,  pp.  836  103-836  103-11. 

[46]  P.  Setlur,  N.  Devroye,  and  Z.  Cheng,  “Waveform  scheduling  via  directed  information  in  cognitive  radar,”  in  IEEE  Statistical 
Signal  Processing  Workshop  (SSP),  Aug  2012,  pp.  864-867. 

[47]  M.  Bell,  “Information  theory  and  radar  waveform  design,”  IEEE  Trans.  Inf.  Theory,  vol.  39,  no.  5,  pp.  1578-1597,  1993. 

[48]  - ,  “Information  theory  and  radar:  mutual  information  and  the  design  and  analysis  of  radar  waveforms  and  systems,” 

Ph.D.  dissertation,  Caltech,  1988. 

[49]  A.  Leshem,  O.  Naparstek,  and  A.  Nehorai,  “Information  theoretic  adaptive  radar  waveform  design  for  multiple  extended 
targets,”  IEEE  Jour.  Selected  Top.  Sign.  Proc.,  vol.  1,  no.  1,  pp.  42-55,  June  2007. 

[50]  R.  Romero  and  N.  Goodman,  “Information-theoretic  matched  waveform  in  signal  dependent  interference,”  in  IEEE  Radar- 
Conference,  2008,  pp.  1-6. 

[51]  W.  Moran,  S.  Suvorova,  and  S.  Howard,  “Applications  of  sensor  scheduling  concepts  to  radar,”  in  Foundations  and 
Applications  for  Sensor  Management,  A.  Hero,  D.  Castanon,  D.  Cochran,  and  K.  Kastella,  Eds.  Springer- Verlag,  2006, 
pp.  221-256. 


45 

Approved  for  public  release;  distribution  unlimited. 


P.  SETLUR  AND  M.  RANGASWAMY:  AFRL  SENSORS  DIRECTORATE  TECH.  REPORT.  2014. 


46 


[52]  S.  Sira,  A.  Papandreou-Suppappola,  D.  Morrell,  and  D.  Cochran,  “Waveform-agile  sensing  for  tracking  multiple  targets  in 
clutter,”  in  2006  40th  Annual  Conference  on  Information  Sciences  and  Systems,  2006,  pp.  1418-1423. 

[53]  S.  P.  Sira,  Y.  Li,  A.  Papandreou-Suppappola,  D.  Morrell,  D.  Cochran,  and  M.  Rangaswamy,  “Waveform-agile  sensing  for 
tracking,”  IEEE  Signal  Processing  Magazine ,  vol.  26,  no.  1,  pp.  53-64,  2009. 

[54]  D.  Kershaw  and  R.  Evans,  “Optimal  waveform  selection  for  tracking  systems,”  IEEE  Trans.  Inf  Theory,  vol.  40,  no.  5, 
pp.  1536-1550,  Sep.  1994. 

[55]  - ,  “Waveform  selective  probabilistic  data  association,”  Aerospace  and  Electronic  Systems,  IEEE  Transactions  on,  vol.  33, 

no.  4,  pp.  1180-1188,  1997. 

[56]  J.  Li,  L.  Xu,  P.  Stoica,  K.  Forsythe,  and  D.  Bliss,  “Range  compression  and  waveform  optimization  for  MIMO  radar:  A 
Cramer  Rao  bound  based  study,”  Signal  Processing,  IEEE  Transactions  on,  vol.  56,  no.  1,  pp.  218-232,  Jan  2008. 

[57]  P.  Setlur,  N.  Devroye,  and  M.  Rangaswamy,  “Waveform  design  and  scheduling  in  space-time  adaptive  radar,”  in  In  proc. 
IEEE  Radar  Conference,  2013. 

[58]  R.  Horn  and  C.  Johnson,  Matrix  Analysis.  Cambridge  University  Press,  2005. 

[59]  A.  Hjorungnes  and  D.  Gesbert,  “Complex-valued  matrix  differentiation:  Techniques  and  key  results,”  IEEE  Trans.  Signal 
Processing,  vol.  55,  no.  6,  pp.  2740-2746,  June  2007. 

[60]  S.  Boyd  and  L.  Vandenberghe,  Convex  Optimization.  New  York,  NY,  USA:  Cambridge  University  Press,  2004. 

[61]  J.-B.  Lasserre,  “A  trace  inequality  for  matrix  product,”  IEEE  Trans,  on  Automatic  Control,  vol.  40,  no.  8,  pp.  1500-1501, 
Aug  1995. 

[62]  J.  J.  More,  “Generalizations  of  the  trust  region  problem,”  Optimization  Methods  and  Software,  vol.  2,  pp.  189-209,  1993. 

[63]  S.  Kay,  Fundamentals  of  Statistical  Signal  Processing,  Volume  1:  Estimation  Theory.  Prentice  Hall,  1998. 

[64]  P.  Setlur  and  M.  Rangaswamy,  “Signal  dependent  clutter  waveform  design  for  radar  stap,”  in  In  Proc.  IEEE  Radar 
Conference,  2014. 

[65]  V.  V.  Williams,  “Multiplying  matrices  faster  than  Coppersmith-Winograd,”  in  Proceedings  of  the  Forty-fourth  Annual  ACM 
Symposium  on  Theory  of  Computing,  ser.  STOC  ’12,  2012,  pp.  887-898. 


46 

Approved  for  public  release;  distribution  unlimited. 


